LG | Dec 31, 2024

Dementia Detection using Multi-modal Methods on Audio Data

arXiv:2501.00465v2 | 1 citation | h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses early detection of dementia. Its contribution is incremental: it applies existing methods to a new dataset.

The paper tackles dementia detection with a model that predicts cognitive impairment from audio recordings. It achieves an RMSE of 2.6911, about 10% lower than the baseline.
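For readers unfamiliar with the metric, the sketch below shows how an RMSE and a relative reduction against a baseline are typically computed. It is an illustration only, not the authors' evaluation code. The function names `rmse` and `relative_reduction` and the toy scores are assumptions made up for this example, not values from the paper.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(model_rmse: float, baseline_rmse: float) -> float:
    """Fraction by which the model's RMSE is lower than the baseline's."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Toy MMSE scores, invented for illustration (not data from the paper).
true_scores = [28.0, 22.0, 30.0, 17.0]
predicted_scores = [26.0, 25.0, 29.0, 20.0]
toy_rmse = rmse(true_scores, predicted_scores)
```

A lower RMSE means predictions sit closer to the true scores on average, so "about 10% lower than the baseline" corresponds to a `relative_reduction` of roughly 0.10.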

Dementia is a neurodegenerative disease that causes gradual cognitive impairment. It is common worldwide and is the subject of extensive research each year into its prevention and cure. It severely impairs a patient's ability to remember events and communicate clearly. Most forms of dementia have no known cure, but early detection can help alleviate symptoms before they worsen. One of the main symptoms is difficulty expressing ideas through speech. This paper describes a model developed to predict the onset of the disease from audio recordings of patients. The model is based on automatic speech recognition (ASR): it generates transcripts from the audio files using the Whisper model, then applies a RoBERTa regression model to produce a Mini-Mental State Examination (MMSE) score for each patient. This score indicates the extent to which the patient's cognitive ability has been affected. We use the PROCESS_V1 dataset, introduced through the PROCESS Grand Challenge 2025, for this task. The model achieved an RMSE of 2.6911, around 10 percent lower than the described baseline.
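The two-stage pipeline described in the abstract (audio to transcript, transcript to MMSE score) could be wired together roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The helper names, the `openai/whisper-small` checkpoint, and the clamping of predictions to the 0 to 30 MMSE range are choices made for this example. The regression checkpoint is left as a parameter because it would have to be a RoBERTa model fine-tuned on MMSE labels, and the abstract names no such checkpoint.

```python
from typing import Callable

# MMSE is scored on a 0-30 scale; clamping predictions to it is an
# assumption of this sketch, not a step stated in the abstract.
MMSE_MIN, MMSE_MAX = 0.0, 30.0


def clamp_mmse(raw_score: float) -> float:
    """Clip a raw regression output to the valid MMSE range."""
    return max(MMSE_MIN, min(MMSE_MAX, raw_score))


def predict_mmse(
    audio_path: str,
    transcribe: Callable[[str], str],
    score_text: Callable[[str], float],
) -> float:
    """Run ASR on the audio, then regress an MMSE score from the transcript."""
    transcript = transcribe(audio_path)
    return clamp_mmse(score_text(transcript))


def build_whisper_transcriber(model_name: str = "openai/whisper-small"):
    """Hypothetical helper: wrap a Hugging Face Whisper ASR pipeline."""
    from transformers import pipeline  # imported lazily; heavy dependency

    asr = pipeline("automatic-speech-recognition", model=model_name)
    return lambda path: asr(path)["text"]


def build_roberta_scorer(checkpoint: str):
    """Hypothetical helper: wrap a RoBERTa model with a 1-output regression head.

    `checkpoint` must point to a model fine-tuned on MMSE labels; an
    untuned `roberta-base` head would produce meaningless scores.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=1  # a single output is treated as regression
    )
    model.eval()

    def score(text: str) -> float:
        inputs = tokenizer(text, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        return logits.squeeze().item()

    return score
```

Passing the transcriber and scorer in as callables keeps `predict_mmse` independent of any particular model, so either stage can be swapped or stubbed out for testing without loading weights.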
