ALPEC: A Comprehensive Evaluation Framework and Dataset for Machine Learning-Based Arousal Detection in Clinical Practice
This work addresses the gap between technological advancements and clinical needs for sleep disorder diagnosis, though it is incremental in refining evaluation methods and dataset availability.
The paper tackles the mismatch between clinical protocols and machine learning methods for arousal detection in sleep disorders by introducing the ALPEC framework, which focuses on detecting arousal onsets and includes a novel multimodal dataset, leading to improved alignment with clinical practice.
Detecting arousals in sleep is essential for diagnosing sleep disorders. However, using Machine Learning (ML) in clinical practice is impeded by fundamental issues, primarily due to mismatches between clinical protocols and ML methods. Clinicians typically annotate only the onset of arousals, while ML methods rely on annotations for both the beginning and end. Additionally, there is no standardized evaluation methodology tailored to clinical needs for arousal detection models. This work addresses these issues by introducing a novel post-processing and evaluation framework emphasizing approximate localization and precise event count (ALPEC) of arousals. We recommend that ML practitioners focus on detecting arousal onsets, aligning with clinical practice. We examine the impact of this shift on current training and evaluation schemes, addressing simplifications and challenges. We utilize a novel comprehensive polysomnographic dataset (CPS) that reflects the aforementioned clinical annotation constraints and includes modalities not present in existing polysomnographic datasets. We release the dataset alongside this paper, demonstrating the benefits of leveraging multimodal data for arousal onset detection. Our findings significantly contribute to integrating ML-based arousal detection in clinical settings, reducing the gap between technological advancements and clinical needs.