IVAICVMar 30, 2024

A Novel Feature Map Enhancement Technique Integrating Residual CNN and Transformer for Alzheimer Diseases Diagnosis

arXiv:2405.12986v25 citationsh-index: 4
Originality Incremental advance
AI Analysis

This addresses timely diagnosis of Alzheimer's disease for patients and clinicians, representing an incremental improvement over existing deep learning methods.

The paper tackles Alzheimer's disease diagnosis from MRI by proposing a hybrid deep learning method combining residual CNN and Transformer components, achieving an accuracy of 98.42% and F1-score of 98.55% on a standard dataset.

Alzheimer diseases (ADs) involves cognitive decline and abnormal brain protein accumulation, necessitating timely diagnosis for effective treatment. Therefore, CAD systems leveraging deep learning advancements have demonstrated success in AD detection but pose computational intricacies and the dataset minor contrast, structural, and texture variations. In this regard, a novel hybrid FME-Residual-HSCMT technique is introduced, comprised of residual CNN and Transformer concepts to capture global and local fine-grained AD analysis in MRI. This approach integrates three distinct elements: a novel CNN Meet Transformer (HSCMT), customized residual learning CNN, and a new Feature Map Enhancement (FME) strategy to learn diverse morphological, contrast, and texture variations of ADs. The proposed HSCMT at the initial stage utilizes stem convolution blocks that are integrated with CMT blocks followed by systematic homogenous and structural (HS) operations. The customized CMT block encapsulates each element with global contextual interactions through multi-head attention and facilitates computational efficiency through lightweight. Moreover, inverse residual and stem CNN in customized CMT enables effective extraction of local texture information and handling vanishing gradients. Furthermore, in the FME strategy, residual CNN blocks utilize TL-based generated auxiliary and are combined with the proposed HSCMT channels at the target level to achieve diverse enriched feature space. Finally, diverse enhanced channels are fed into a novel spatial attention mechanism for optimal pixel selection to reduce redundancy and discriminate minor contrast and texture inter-class variation. The proposed achieves an F1-score (98.55%), an accuracy of 98.42% and a sensitivity of 98.50%, a precision of 98.60% on the standard Kaggle dataset, and demonstrates outperformance existing ViTs and CNNs methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes