SAVEHR: Self Attention Vector Representations for EHR based Personalized Chronic Disease Onset Prediction and Interpretability
It addresses personalized chronic disease prediction for healthcare providers, offering a method that is incremental but improves performance, generalization, and interpretability.
The paper tackled predicting chronic disease onset from EHR data using a self-attention architecture, achieving >0.51 AUC-PR and >0.87 AUC-ROC gains for four conditions 15 months in advance and transferring well to new populations.
Chronic disease progression is emerging as an important area of investment for healthcare providers. As the quantity and richness of available clinical data continue to increase along with advances in machine learning, there is great potential to advance our approaches to caring for patients. An ideal approach to this problem should generate good performance on at least three axes namely, a) perform across many clinical conditions without requiring deep clinical expertise or extensive data scientist effort, b) generalization across populations, and c) be explainable (model interpretability). We present SAVEHR, a self-attention based architecture on heterogeneous structured EHR data that achieves $>$ 0.51 AUC-PR and $>$ 0.87 AUC-ROC gains on predicting the onset of four clinical conditions (CHF, Kidney Failure, Diabetes and COPD) 15-months in advance, and transfers with high performance onto a new population. We demonstrate that SAVEHR model performs superior to ten baselines on all three axes stated formerly.