LG · Jan 12, 2023

SACDNet: Towards Early Type 2 Diabetes Prediction with Uncertainty for Electronic Health Records

arXiv:2301.04844v2 · 4 citations · h-index: 19

AI Analysis

The paper addresses early diagnosis of Type 2 diabetes to prevent complications, but the contribution is incremental: it builds on existing methods and reports modest gains.

The study tackles early Type 2 diabetes prediction from electronic health records by proposing SACDNet, a neural network with self-attention and dense layers. It reports 89.3% accuracy and an 89.1% F1-score, a 1.6% accuracy improvement over the baselines.

Type 2 diabetes mellitus (T2DM) is one of the most common diseases and a leading cause of death. Early diagnosis of T2DM is challenging, yet necessary to prevent serious complications. This study proposes a novel neural network architecture for early T2DM prediction that uses multi-headed self-attention and dense layers to extract features from historical diagnoses, patient vitals, and demographics. The proposed technique, the Self-Attention for Comorbid Disease Net (SACDNet), achieves an accuracy of 89.3% and an F1-score of 89.1%, which is 1.6% higher in accuracy and 1.3% higher in F1-score than the baseline techniques. Monte Carlo (MC) Dropout is applied to SACDNet to obtain a Bayesian approximation. A T2DM prediction framework based on the MC Dropout SACDNet is proposed to quantify the uncertainty associated with the predictions. A T2DM prediction dataset is also built as part of this study. It is based on real-world routine Electronic Health Record (EHR) data comprising 4,124 diabetic and 181,767 non-diabetic examples, collected from 295 different EHR systems running in different parts of the United States of America. This dataset is further used to evaluate 7 different machine learning and 3 deep learning-based models. Finally, a detailed analysis of the fairness of every technique against different patient demographic groups is performed to validate the unbiased generalization of the techniques and the diversity of the data.
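The idea behind MC Dropout can be illustrated with a minimal sketch: dropout stays active at prediction time, the same input is passed through the model many times, and the spread of the outputs is read as a rough uncertainty estimate. The code below is not SACDNet. It uses a toy one-layer logistic model, and the names `mc_dropout_predict` and `triage`, the weights, the dropout rate, and the referral threshold are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(features, weights, bias, drop_rate, rng):
    """One stochastic forward pass of a toy logistic model with dropout left on."""
    keep = 1.0 - drop_rate
    z = bias
    for x, w in zip(features, weights):
        # Inverted dropout: keep each input with probability `keep`
        # and rescale the survivors so the expected sum is unchanged.
        if rng.random() < keep:
            z += (x * w) / keep
    return sigmoid(z)


def mc_dropout_predict(features, weights, bias, drop_rate=0.2, passes=200, seed=0):
    """Run several stochastic passes; return the mean probability and its spread."""
    rng = random.Random(seed)
    samples = [forward(features, weights, bias, drop_rate, rng) for _ in range(passes)]
    return statistics.fmean(samples), statistics.pstdev(samples)


def triage(mean, spread, threshold=0.5, max_spread=0.15):
    """Hypothetical decision rule: defer to a clinician when the passes disagree."""
    if spread > max_spread:
        return "refer for review"
    return "T2DM risk" if mean >= threshold else "no T2DM risk"


if __name__ == "__main__":
    # Made-up, already-normalized patient features and weights (assumptions).
    patient = [0.8, -0.3, 1.2, 0.5]
    weights = [1.1, 0.4, 0.9, -0.6]
    mean, spread = mc_dropout_predict(patient, weights, bias=-0.5)
    print(f"mean={mean:.3f} spread={spread:.3f} -> {triage(mean, spread)}")
```

In this sketch the standard deviation across passes stands in for predictive uncertainty. How the paper's framework turns the MC Dropout samples into a decision is not stated in the abstract, so the `triage` rule here is only one plausible way to act on the estimate.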
