AI Generalisation Gap In Comorbid Sleep Disorder Staging
This work addresses the challenge of deploying automated sleep staging in clinical settings for patients with comorbidities and highlights the need for disease-specific models. It is incremental, however, because it primarily demonstrates a known limitation without introducing a novel solution.
The paper tackles the poor generalization of deep learning sleep staging models from healthy subjects to clinical populations with disrupted sleep, such as stroke patients. It finds that cross-domain performance is poor and that the models focus on uninformative EEG regions in patient data.
Accurate sleep staging is essential for diagnosing obstructive sleep apnea (OSA) and hypopnea in stroke patients. Although polysomnography (PSG) is reliable, it is costly, labor-intensive, and manually scored. While deep learning enables automated EEG-based sleep staging in healthy subjects, our analysis shows poor generalization to clinical populations with disrupted sleep. Using Grad-CAM interpretations, we systematically demonstrate this limitation. We introduce iSLEEPS, a new, clinically annotated ischemic stroke dataset (to be publicly released), and evaluate an SE-ResNet plus bidirectional LSTM model for single-channel EEG sleep staging. As expected, cross-domain performance between healthy and diseased subjects is poor. Attention visualizations, supported by clinical expert feedback, show that the model focuses on physiologically uninformative EEG regions in patient data. Statistical and computational analyses further confirm significant sleep architecture differences between healthy and ischemic stroke cohorts, highlighting the need for subject-aware or disease-specific models with clinical validation before deployment. A summary of the paper and the code is available at https://himalayansaswatabose.github.io/iSLEEPS_Explainability.github.io/
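To make the Grad-CAM step concrete, the following is a minimal sketch of the standard Grad-CAM computation adapted to a one-dimensional signal such as a single-channel EEG epoch. It is not the authors' implementation: the function name `grad_cam_1d`, the peak normalization, and the toy numbers are assumptions chosen for illustration. In practice the activations and gradients would be taken from a convolutional layer of the trained network.

```python
from typing import List


def grad_cam_1d(activations: List[List[float]],
                gradients: List[List[float]]) -> List[float]:
    """Illustrative 1-D Grad-CAM (hypothetical helper, not from the paper).

    activations[k][t]: output of feature map k at time step t.
    gradients[k][t]:   derivative of the class score w.r.t. activations[k][t].
    Returns a relevance value per time step, scaled to [0, 1].
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("activations and gradients must have matching shapes")
    n_steps = len(activations[0])

    # Channel weight: gradient of the class score averaged over time.
    weights = [sum(g) / n_steps for g in gradients]

    # Weighted sum of feature maps at each time step, followed by ReLU so
    # that only evidence in favour of the class is kept.
    cam = [
        max(0.0, sum(w * a[t] for w, a in zip(weights, activations)))
        for t in range(n_steps)
    ]

    # Assumption: scale by the peak so maps are comparable across epochs.
    peak = max(cam)
    return [c / peak for c in cam] if peak > 0 else cam


# Toy example: two feature maps over four time steps.
toy_activations = [[1.0, 2.0, 0.0, 0.0],
                   [0.0, 0.0, 3.0, 1.0]]
toy_gradients = [[0.5, 0.5, 0.5, 0.5],
                 [-1.0, -1.0, -1.0, -1.0]]
relevance = grad_cam_1d(toy_activations, toy_gradients)
print(relevance)
```

In this toy case the first feature map has a positive weight and the second a negative one, so the relevance concentrates on the first two time steps. A map of this kind, overlaid on the EEG trace, is what a clinical expert would inspect to judge whether the highlighted regions are physiologically informative.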