ProtoEHR: Hierarchical Prototype Learning for EHR-based Healthcare Predictions
This work addresses the need of clinicians and researchers for more interpretable AI in healthcare prediction, though it appears incremental because it builds on existing prototype learning and hierarchical modeling approaches.
The authors address the limited predictive performance and interpretability of EHR-based healthcare predictions by proposing ProtoEHR, a hierarchical prototype learning framework that exploits multi-level EHR data. They report accurate, robust, and interpretable predictions relative to baselines across five clinical tasks on two public datasets.
Digital healthcare systems have enabled the collection of large volumes of healthcare data in electronic healthcare records (EHRs), enabling artificial intelligence solutions for various healthcare prediction tasks. However, existing studies often focus on isolated components of EHR data, which limits their predictive performance and interpretability. To address this gap, we propose ProtoEHR, an interpretable hierarchical prototype learning framework that fully exploits the rich, multi-level structure of EHR data to enhance healthcare predictions. More specifically, ProtoEHR models relationships within and across three hierarchical levels of EHRs: medical codes, hospital visits, and patients. We first leverage large language models to extract semantic relationships among medical codes and construct a medical knowledge graph as the knowledge source. Building on this, we design a hierarchical representation learning framework that captures contextualized representations across the three levels, while incorporating prototype information within each level to capture intrinsic similarities and improve generalization. For a comprehensive assessment, we evaluate ProtoEHR on two public datasets and five clinically significant tasks: mortality prediction, readmission prediction, length-of-stay prediction, drug recommendation, and phenotype prediction. The results demonstrate that ProtoEHR makes accurate, robust, and interpretable predictions compared to baselines in the literature. Furthermore, ProtoEHR offers interpretable insights at the code, visit, and patient levels to aid healthcare prediction.
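The three-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how prototype information is incorporated, so the sketch assumes a simple softmax-weighted prototype mix added to each representation, and it uses mean pooling to move from codes to visits to patients. The knowledge graph step is omitted. All names (`add_prototype_info`, `encode_patient`, the toy embeddings and prototypes) are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(vectors):
    # Element-wise mean of equally sized vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def add_prototype_info(vec, prototypes):
    # Assumption: similarity is a dot product, and the enriched vector is
    # the input plus a softmax-weighted mix of the prototypes.
    weights = softmax([dot(vec, p) for p in prototypes])
    mix = [sum(w * p[i] for w, p in zip(weights, prototypes))
           for i in range(len(vec))]
    return [v + m for v, m in zip(vec, mix)], weights

def encode_patient(visits, code_emb, protos):
    # Code level -> visit level -> patient level, with prototypes at each.
    visit_vecs = []
    for codes in visits:
        code_vecs = [add_prototype_info(code_emb[c], protos["code"])[0]
                     for c in codes]
        visit_vec, _ = add_prototype_info(mean(code_vecs), protos["visit"])
        visit_vecs.append(visit_vec)
    return add_prototype_info(mean(visit_vecs), protos["patient"])

# Toy data (hypothetical): 2-dimensional embeddings for three medical codes.
code_emb = {"I10": [1.0, 0.0], "E11": [0.0, 1.0], "N18": [1.0, 1.0]}
protos = {
    "code": [[1.0, 0.0], [0.0, 1.0]],
    "visit": [[0.5, 0.5], [1.0, -1.0]],
    "patient": [[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
}
visits = [["I10", "E11"], ["N18"]]  # one patient, two visits

patient_vec, patient_weights = encode_patient(visits, code_emb, protos)
print(patient_vec, patient_weights)
```

In this sketch, the returned weights indicate which patient-level prototype the patient most resembles, which is one simple way a prototype model can expose an interpretable signal; the same weights are available at the code and visit levels.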