LG · AI · Sep 15, 2025

Early Prediction of Multi-Label Care Escalation Triggers in the Intensive Care Unit Using Electronic Health Records

Syed Ahmad Chan Bukhari, Amritpal Singh, Shifath Hossain, Iram Wajahat

arXiv:2509.18145v1 · 4.1 · h-index: 4 · HealthCom

Originality: Incremental advance

AI Analysis

This work addresses the need for early, interpretable alerts that help ICU clinicians escalate care in time. The advance is incremental: it applies existing machine learning methods to a new multi-label task in healthcare.

This study addresses the problem of predicting multiple overlapping signs of physiological deterioration in ICU patients. It develops a multi-label classification framework for Care Escalation Triggers (CETs) using the first 24 hours of electronic health record data. XGBoost, the best model, achieves F1-scores ranging from 0.62 to 0.76 across the four CETs.

Intensive Care Unit (ICU) patients often present with complex, overlapping signs of physiological deterioration that require timely escalation of care. Traditional early warning systems, such as SOFA or MEWS, are limited by their focus on single outcomes and fail to capture the multi-dimensional nature of clinical decline. This study proposes a multi-label classification framework to predict Care Escalation Triggers (CETs), including respiratory failure, hemodynamic instability, renal compromise, and neurological deterioration, using the first 24 hours of ICU data. Using the MIMIC-IV database, CETs are defined through rule-based criteria applied to data from hours 24 to 72 (for example, oxygen saturation below 90 percent, mean arterial pressure below 65 mmHg, creatinine increase greater than 0.3 mg/dL, or a drop in Glasgow Coma Scale score greater than 2). Features are extracted from the first 24 hours and include vital sign aggregates, laboratory values, and static demographics. We train and evaluate multiple classification models on a cohort of 85,242 ICU stays (80 percent training: 68,193; 20 percent testing: 17,049). Evaluation metrics include per-label precision, recall, F1-score, and Hamming loss. XGBoost, the best performing model, achieves F1-scores of 0.66 for respiratory, 0.72 for hemodynamic, 0.76 for renal, and 0.62 for neurologic deterioration, outperforming baseline models. Feature analysis shows that clinically relevant parameters such as respiratory rate, blood pressure, and creatinine are the most influential predictors, consistent with the clinical definitions of the CETs. The proposed framework demonstrates practical potential for early, interpretable clinical alerts without requiring complex time-series modeling or natural language processing.
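The rule-based labels and two of the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The `Window` record, its field names, and the way baselines for creatinine and Glasgow Coma Scale are computed are assumptions made for illustration; only the four thresholds come from the abstract.

```python
from dataclasses import dataclass

# Label order used throughout this sketch (names are illustrative).
LABELS = ("respiratory", "hemodynamic", "renal", "neurologic")


@dataclass
class Window:
    """Hypothetical summary of one ICU stay's observations in hours 24 to 72."""
    min_spo2: float         # lowest oxygen saturation, percent
    min_map: float          # lowest mean arterial pressure, mmHg
    creatinine_rise: float  # increase over an assumed baseline, mg/dL
    gcs_drop: float         # decrease in Glasgow Coma Scale from an assumed baseline


def cet_labels(w: Window) -> list[int]:
    """Apply the example thresholds from the abstract; returns one 0/1 flag per label."""
    return [
        int(w.min_spo2 < 90),
        int(w.min_map < 65),
        int(w.creatinine_rise > 0.3),
        int(w.gcs_drop > 2),
    ]


def hamming_loss(y_true: list[list[int]], y_pred: list[list[int]]) -> float:
    """Fraction of individual label slots predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / total


def per_label_f1(y_true: list[list[int]], y_pred: list[list[int]]) -> dict[str, float]:
    """F1-score computed separately for each label column."""
    scores = {}
    for j, name in enumerate(LABELS):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Define F1 as 0.0 when a label has no positives and no predictions.
        scores[name] = 2 * tp / denom if denom else 0.0
    return scores
```

Because a single stay can trip several thresholds at once, each stay maps to a vector of four flags, not a single class. That is why the metrics are computed per label or per label slot instead of per stay.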
