SP LG · Sep 23, 2024

Designing Pre-training Datasets from Unlabeled Data for EEG Classification with Transformers

arXiv:2410.07190v1 · 1.2 · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses the cost of expert annotation in medical EEG analysis for tasks such as epileptic seizure forecasting. The advance is incremental, since it adapts existing self-supervised methods to a specific domain.

The paper tackles the problem of scarce labeled data in EEG classification by designing pre-training datasets from unlabeled EEG data, resulting in models that reduce fine-tuning time by over 50% and improve accuracy from 90.93% to 92.16% with AUC increasing from 0.9648 to 0.9702.
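To put the reported gains in perspective, the short calculation below restates them as an absolute accuracy gain, a relative reduction in error rate, and an absolute AUC gain. It uses only the four figures quoted above; the variable names are illustrative, and the error rate is assumed to be simply 100% minus accuracy.

```python
# Figures quoted in the summary above.
acc_before, acc_after = 90.93, 92.16      # accuracy, in percent
auc_before, auc_after = 0.9648, 0.9702    # area under the ROC curve

# Absolute accuracy gain, in percentage points.
acc_gain = acc_after - acc_before

# Error rate is assumed to be 100% minus accuracy.
err_before = 100.0 - acc_before
err_after = 100.0 - acc_after

# Fraction of the original errors removed by pre-training.
rel_err_reduction = (err_before - err_after) / err_before

# Absolute AUC gain.
auc_gain = auc_after - auc_before

print(f"accuracy gain: {acc_gain:.2f} percentage points")
print(f"relative error reduction: {rel_err_reduction:.1%}")
print(f"AUC gain: {auc_gain:.4f}")
```

Read this way, the 1.23-point accuracy gain corresponds to removing roughly one in seven of the baseline model's errors.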

Transformer neural networks require a large amount of labeled data to train effectively. Such data is often scarce in electroencephalography, as annotations made by medical experts are costly. This is why self-supervised training, using unlabeled data, has to be performed beforehand. In this paper, we present a way to design several labeled datasets from unlabeled electroencephalogram (EEG) data. These can then be used to pre-train transformers to learn representations of EEG signals. We tested this method on an epileptic seizure forecasting task on the Temple University Seizure Detection Corpus using a Multi-channel Vision Transformer. Our results suggest that (1) models pre-trained using our approach train significantly faster, reducing fine-tuning duration by more than 50% for this task, and (2) pre-trained models are more accurate than non-pre-trained models, with accuracy increasing from 90.93% to 92.16% and AUC rising from 0.9648 to 0.9702.
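The abstract does not say which pseudo-labels the authors derive, so the following is only a minimal sketch of the general idea of turning unlabeled EEG into a labeled pre-training set. It assumes a hypothetical pretext task (predict which transformation was applied to a window), a hypothetical helper name `make_pretext_dataset`, and synthetic noise standing in for a real recording. None of these come from the paper.

```python
import numpy as np

def make_pretext_dataset(eeg, window, rng):
    """Cut an unlabeled recording into windows and assign pseudo-labels.

    eeg: array of shape (channels, samples).
    The pretext task is an assumption, not the paper's: the label says
    which transformation was applied to the window
    (0 = none, 1 = time reversal, 2 = sign flip).
    """
    n_channels = eeg.shape[0]
    n_windows = eeg.shape[1] // window
    # Drop trailing samples that do not fill a whole window.
    trimmed = eeg[:, : n_windows * window]
    # Rearrange to (n_windows, channels, window); copy so it is writable.
    windows = (
        trimmed.reshape(n_channels, n_windows, window)
        .transpose(1, 0, 2)
        .copy()
    )
    # Draw one pseudo-label per window, then apply the matching transform.
    labels = rng.integers(0, 3, size=n_windows)
    windows[labels == 1] = windows[labels == 1][:, :, ::-1]
    windows[labels == 2] = -windows[labels == 2]
    return windows, labels

rng = np.random.default_rng(0)
unlabeled = rng.standard_normal((4, 1000))  # synthetic stand-in for EEG
X, y = make_pretext_dataset(unlabeled, window=250, rng=rng)
print(X.shape, y.shape)
```

A classifier pre-trained to predict `y` from `X` needs no expert annotation. Its weights could then be fine-tuned on the smaller labeled seizure dataset, which is the workflow the abstract describes.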
