Survival Meets Classification: A Novel Framework for Early Risk Prediction Models of Chronic Diseases
The framework provides a more comprehensive tool for disease risk surveillance in healthcare, though the contribution is incremental because it combines existing techniques.
The authors tackled early risk prediction for five chronic diseases by integrating survival analysis with classification, achieving performance comparable to or better than state-of-the-art models like LightGBM and XGBoost in accuracy, F1 score, and AUROC on real-world EMR data.
Chronic diseases are long-lasting conditions that require lifelong medical attention. Using large-scale electronic medical record (EMR) data, we have developed early disease risk prediction models for five common chronic diseases: diabetes, hypertension, chronic kidney disease (CKD), chronic obstructive pulmonary disease (COPD), and chronic ischemic heart disease. We present a novel approach to disease risk modeling that integrates survival analysis with classification techniques. Traditional models for predicting the risk of chronic diseases predominantly rely on either survival analysis or classification, applied independently. We show that survival analysis methods can be re-engineered to perform classification efficiently and effectively, making them a comprehensive tool for developing disease risk surveillance models. Our experiments on real-world EMR data show that the survival models match or exceed prior state-of-the-art models such as LightGBM and XGBoost in accuracy, F1 score, and AUROC. Lastly, the proposed survival models use a novel methodology to generate explanations, which have been clinically validated by a panel of three expert physicians.
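The abstract does not say how the survival models are re-engineered for classification. One common way to bridge the two, sketched below purely as an illustration, is to read the estimated survival function S(t) at a fixed prediction horizon and threshold the implied event probability 1 - S(t). The Kaplan-Meier estimator here stands in for whatever survival model the authors actually use, and the function names, the toy data, the horizon, and the 0.5 threshold are all assumptions, not details from the paper.

```python
from bisect import bisect_right


def kaplan_meier(durations, events):
    """Kaplan-Meier estimate: returns (event_times, survival_probabilities).

    durations: follow-up time per subject.
    events: 1 if the disease onset was observed, 0 if the subject was censored.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    times, surv = [], []
    s = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        onsets = 0
        leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            onsets += data[i][1]
            leaving += 1
            i += 1
        if onsets > 0:
            s *= 1.0 - onsets / n_at_risk
            times.append(t)
            surv.append(s)
        n_at_risk -= leaving
    return times, surv


def risk_by_horizon(times, surv, horizon):
    """Probability of onset on or before `horizon`, i.e. 1 - S(horizon)."""
    idx = bisect_right(times, horizon)
    s_at_horizon = surv[idx - 1] if idx > 0 else 1.0
    return 1.0 - s_at_horizon


def classify(times, surv, horizon, threshold=0.5):
    """Binary label: 1 if the risk within the horizon reaches the threshold."""
    return int(risk_by_horizon(times, surv, horizon) >= threshold)


# Toy cohort (hypothetical): follow-up in years, 1 = onset observed.
toy_durations = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
toy_times, toy_surv = kaplan_meier(toy_durations, toy_events)
label_3y = classify(toy_times, toy_surv, horizon=3)
label_4y = classify(toy_times, toy_surv, horizon=4)
```

With a per-patient survival model, the same thresholding step would be applied to each patient's own predicted curve, and the resulting labels or probabilities could then be scored with accuracy, F1, and AUROC alongside a standard classifier.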