CV | AI | Oct 30, 2024

DiaMond: Dementia Diagnosis with Multi-Modal Vision Transformers Using MRI and PET

arXiv:2410.23219v1 | 14 citations | h-index: 8 | Has Code | WACV
Originality: Incremental advance
AI Analysis

This work addresses the clinical challenge of accurate dementia diagnosis, though it appears incremental: it builds on existing multi-modal methods with specific improvements.

The paper tackles dementia diagnosis by integrating MRI and PET data in a novel multi-modal vision Transformer framework, achieving a balanced accuracy of 92.4% for Alzheimer's Disease diagnosis and 76.5% for differential diagnosis between Alzheimer's Disease and frontotemporal dementia.

Diagnosing dementia, particularly for Alzheimer's Disease (AD) and frontotemporal dementia (FTD), is complex due to overlapping symptoms. While magnetic resonance imaging (MRI) and positron emission tomography (PET) data are critical for the diagnosis, integrating these modalities in deep learning faces challenges, often resulting in suboptimal performance compared to using single modalities. Moreover, the potential of multi-modal approaches in differential diagnosis, which holds significant clinical importance, remains largely unexplored. We propose a novel framework, DiaMond, to address these issues with vision Transformers to effectively integrate MRI and PET. DiaMond is equipped with self-attention and a novel bi-attention mechanism that synergistically combine MRI and PET, alongside a multi-modal normalization to reduce redundant dependency, thereby boosting the performance. DiaMond significantly outperforms existing multi-modal methods across various datasets, achieving a balanced accuracy of 92.4% in AD diagnosis, 65.2% for AD-MCI-CN classification, and 76.5% in differential diagnosis of AD and FTD. We also validated the robustness of DiaMond in a comprehensive ablation study. The code is available at https://github.com/ai-med/DiaMond.
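The results above are reported as balanced accuracy rather than plain accuracy. Balanced accuracy is commonly defined as the mean of per-class recall, so a large class cannot dominate the score. The sketch below illustrates that definition on made-up labels; it is not the authors' evaluation code, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy data: 2 AD cases and 8 CN cases.
y_true = ["AD", "AD"] + ["CN"] * 8
y_pred = ["AD", "CN"] + ["CN"] * 8  # one AD case is missed

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("plain accuracy:", plain_accuracy)
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

In this toy case the plain accuracy is 9 of 10 correct, while the balanced accuracy averages a recall of 1/2 for AD with a recall of 8/8 for CN, which penalizes the missed minority-class case more heavily.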

Code Implementations: 1 repo
