QM, LG, SP, NC
Mar 13, 2025

BioSerenity-E1: a self-supervised EEG model for medical applications

arXiv:2503.10362v1
5 citations
h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses the scarcity of specialized expertise for EEG analysis in neurology, though the advance is incremental because it builds on existing self-supervised learning methods.

The paper tackles the problem of automating EEG interpretation for medical diagnostics by introducing BioSerenity-E1, a self-supervised foundation model that achieves state-of-the-art performance on tasks like seizure detection (AUROC = 0.926) and normal/abnormal classification (AUPRC = 0.970), with significant gains in low-data scenarios.

Electroencephalography (EEG) serves as an essential diagnostic tool in neurology; however, its accurate manual interpretation is a time-intensive process that demands highly specialized expertise, which remains relatively scarce and not consistently accessible. To address these limitations, the implementation of automated pre-screening and analysis systems for EEG data holds considerable promise. Advances in self-supervised learning have made it possible to pre-train complex deep learning architectures on large volumes of unlabeled EEG data to learn generalizable representations that can later be used to enhance performance on multiple tasks while needing less downstream data. In the present paper, we introduce BioSerenity-E1, the first of a family of self-supervised foundation models for clinical EEG applications that combines spectral tokenization with masked prediction to achieve state-of-the-art performance across relevant diagnostic tasks. The two-phase self-supervised pretraining framework first acquires compressed EEG representations via a transformer-based VQ-VAE architecture designed to reconstruct log-multitaper spectral projections, then applies extensive (70% block) masked token prediction to force the model to learn complex spatiotemporal dependencies in EEG signals. BioSerenity-E1 achieves strong performance across three clinical tasks, either in line with or above state-of-the-art methods: seizure detection (AUROC = 0.926, Sensitivity = 0.909), normal/abnormal classification (AUPRC = 0.970 on proprietary data; 0.910 on TUH-Abnormal), and multiclass pathology differentiation on unbalanced data (Weighted F1 = 0.730). The utility of BioSerenity-E1 is further confirmed in low-data regimes, showing clear improvements in AUPRC (from +2% to +17%) when trained on less than 10% of the available data.
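
The two pretraining phases described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the codebook values, the block size of 2, the `MASK_ID` placeholder, and the helper names `quantize` and `block_mask` are all assumptions made for illustration. In the paper the tokenizer is a learned transformer-based VQ-VAE over log-multitaper spectra; here, phase one is reduced to a nearest-codebook lookup and phase two to hiding about 70% of the tokens in contiguous blocks.

```python
import random

def quantize(vectors, codebook):
    """Map each feature vector to the index of its nearest codebook entry
    (squared Euclidean distance). Stands in for a learned VQ-VAE tokenizer."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: sqdist(v, codebook[k]))
            for v in vectors]

def block_mask(n_tokens, block_size, mask_ratio=0.7, seed=0):
    """Return a list of booleans, True where a token is masked.
    Whole contiguous blocks are masked, in random order, until at least
    mask_ratio of the tokens are hidden."""
    rng = random.Random(seed)
    starts = list(range(0, n_tokens, block_size))
    rng.shuffle(starts)
    target = round(mask_ratio * n_tokens)
    mask = [False] * n_tokens
    masked = 0
    for start in starts:
        if masked >= target:
            break
        for i in range(start, min(start + block_size, n_tokens)):
            mask[i] = True
            masked += 1
    return mask

# Hypothetical toy data: 20 two-dimensional "spectral" features that sit
# close to one of three codebook entries.
MASK_ID = -1  # assumed placeholder id for a masked token
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
features = [[codebook[i % 3][0] + 0.1, codebook[i % 3][1] - 0.1]
            for i in range(20)]

tokens = quantize(features, codebook)          # phase 1: discrete tokens
mask = block_mask(len(tokens), block_size=2)   # phase 2: 70% block masking
inputs = [MASK_ID if m else t for t, m in zip(tokens, mask)]
targets = [t for t, m in zip(tokens, mask) if m]  # what a model would predict
```

A masked-prediction model would receive `inputs` and be trained to recover `targets` at the masked positions. Masking whole blocks rather than isolated tokens makes it harder to recover a token by copying its immediate neighbours, which is in line with the abstract's stated aim of forcing the model to learn longer-range dependencies.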
