Self-supervised Learning for Label-Efficient Sleep Stage Classification: A Comprehensive Evaluation
This work addresses the expensive and time-consuming labeling of data in sleep labs and offers a practical solution for real-world applications, although the contribution is incremental because it applies existing self-supervised learning techniques to a specific domain.
The paper tackles the problem of limited labeled data for EEG-based sleep stage classification by evaluating whether self-supervised learning (SSL) can boost model performance when few labels are available. It finds that fine-tuning pretrained models with only 5% of the labeled data achieves performance competitive with fully supervised training.
The past few years have witnessed remarkable advances in deep learning for EEG-based sleep stage classification (SSC). However, the success of these models depends on access to a massive amount of labeled training data, which limits their applicability in real-world scenarios. In such scenarios, sleep labs can generate a massive amount of data, but labeling these data can be expensive and time-consuming. Recently, the self-supervised learning (SSL) paradigm has emerged as one of the most successful techniques for overcoming the scarcity of labeled data. In this paper, we evaluate the efficacy of SSL in boosting the performance of existing SSC models in the few-labels regime. We conduct a thorough study on three SSC datasets and find that fine-tuning the pretrained SSC models with only 5% of the labeled data achieves performance competitive with supervised training on full labels. Moreover, self-supervised pretraining helps SSC models become more robust to data imbalance and domain shift. The code is publicly available at https://github.com/emadeldeen24/eval_ssl_ssc.
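
To make the few-labels regime concrete, the following minimal sketch shows one way a 5% labeled subset could be drawn for fine-tuning. It is an illustration under stated assumptions, not the authors' procedure: the function name `stratified_label_subset`, the per-class (stratified) sampling, the one-example-per-class minimum, and the toy stage counts are all hypothetical and do not come from the paper or its repository.

```python
import random
from collections import defaultdict


def stratified_label_subset(labels, fraction=0.05, seed=0):
    """Return sorted indices of a class-stratified subset of labeled epochs.

    Hypothetical helper: sampling per class is an assumption made here so
    that rare sleep stages remain represented in the small labeled set.
    """
    rng = random.Random(seed)

    # Group epoch indices by their sleep-stage label.
    by_class = defaultdict(list)
    for idx, stage in enumerate(labels):
        by_class[stage].append(idx)

    chosen = []
    for indices in by_class.values():
        # Keep at least one epoch per class so no stage disappears entirely.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy, deliberately imbalanced label sequence (counts are made up).
labels = ["W"] * 400 + ["N1"] * 40 + ["N2"] * 800 + ["N3"] * 200 + ["REM"] * 360
subset = stratified_label_subset(labels, fraction=0.05)
```

In this sketch, the returned indices would select the epochs whose labels are used for fine-tuning, while the remaining epochs would be available without labels for self-supervised pretraining.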