An Explainable Ensemble Framework for Alzheimer's Disease Prediction Using Structured Clinical and Cognitive Data
This framework offers a reliable and transparent approach for early Alzheimer's disease prediction, potentially aiding clinical decision support for medical professionals.
This research developed an explainable ensemble learning framework to classify individuals with Alzheimer's disease using structured clinical and cognitive data. The ensemble methods, particularly XGBoost, Random Forest, and Soft Voting, achieved superior accuracy, sensitivity, and F1-score compared to deep learning approaches.
Early and accurate detection of Alzheimer's disease (AD) remains a major challenge in medical diagnosis due to its subtle onset and progressive nature. This research introduces an explainable ensemble learning Framework designed to classify individuals as Alzheimer's or Non-Alzheimer's using structured clinical, lifestyle, metabolic, and lifestyle features. The workflow incorporates rigorous preprocessing, advanced feature engineering, SMOTE-Tomek hybrid class balancing, and optimized modeling using five ensemble algorithms-Random Forest, XGBoost, LightGBM, CatBoost, and Extra Trees-alongside a deep artificial neural network. Model selection was performed using stratified validation to prevent leakage, and the best-performing model was evaluated on a fully unseen test set. Ensemble methods achieved superior performance over deep learning, with XGBoost, Random Forest, and Soft Voting showing the strongest accuracy, sensitivity, and F1-score profiles. Explainability techniques, including SHAP and feature importance analysis, highlighted MMSE, Functional Assessment Age, and several engineered interaction features as the most influential determinants. The results demonstrate that the proposed framework provides a reliable and transparent approach to Alzheimer's disease prediction, offering strong potential for clinical decision support applications.