LG · CL · Nov 30, 2020

Multi-Modal Detection of Alzheimer's Disease from Speech and Text

arXiv:2012.00096v3 · 27 citations
AI Analysis

This work is relevant to the medical community, particularly neurologists, because it offers an incremental improvement in the early, non-invasive detection of Alzheimer's disease.

This paper addresses the challenge of early Alzheimer's disease (AD) detection by proposing a multimodal deep learning method that analyzes speech and the corresponding transcript together. The method achieves 85.3% accuracy under 10-fold cross-validation on the DementiaBank Pitt corpus.

Reliable detection of the prodromal stages of Alzheimer's disease (AD) remains difficult even today because, unlike other neurocognitive impairments, there is no definitive diagnosis of AD in vivo. In this context, existing research has shown that patients often develop language impairment even in mild AD conditions. We propose a multimodal deep learning method that utilizes speech and the corresponding transcript simultaneously to detect AD. For audio signals, the proposed audio-based network, a convolutional neural network (CNN) based model, predicts the diagnosis for multiple speech segments, and these segment-level predictions are combined for the final audio prediction. Similarly, to classify the transcript, we use a contextual embedding extracted from BERT concatenated with a CNN-generated embedding. The individual predictions of the two models are then combined to make the final classification. We also perform experiments to analyze the model performance when transcripts generated by an Automated Speech Recognition (ASR) system are used instead of manual transcription in the text-based model. The proposed method achieves 85.3% 10-fold cross-validation accuracy when trained and evaluated on the DementiaBank Pitt corpus.
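
The two-stage combination described in the abstract can be illustrated with a minimal late-fusion sketch. The abstract does not say how segment predictions or the two model outputs are combined, so the simple mean over segments, the weighted average across modalities, the 0.5 threshold, and all function and parameter names below are assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean


def audio_probability(segment_probs):
    """Collapse per-segment AD probabilities into one recording-level score.

    Assumption: a plain mean over segments; the paper may use another rule.
    """
    if not segment_probs:
        raise ValueError("need at least one segment prediction")
    return mean(segment_probs)


def fuse(audio_prob, text_prob, audio_weight=0.5):
    """Combine the audio and text model probabilities.

    Assumption: a weighted average, with audio_weight as a hypothetical knob.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_prob + (1.0 - audio_weight) * text_prob


def classify(segment_probs, text_prob, threshold=0.5, audio_weight=0.5):
    """Return a (label, fused_probability) pair for one participant."""
    fused = fuse(audio_probability(segment_probs), text_prob, audio_weight)
    label = "AD" if fused >= threshold else "control"
    return label, fused


if __name__ == "__main__":
    # Made-up probabilities, standing in for the outputs of the two networks.
    print(classify([0.9, 0.7, 0.8], text_prob=0.6))
```

Fusing at the prediction level keeps the audio and text models independent, so either can be retrained or replaced (for example, swapping manual transcripts for ASR output in the text model) without touching the other.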
