CLEF: EEG Foundation Model for Learning Clinical Semantics
This work presents a foundation model for clinical EEG analysis that integrates full-session context and clinical semantics, outperforming prior short-window models on 229 of 234 benchmark tasks.
CLEF is a clinically grounded long-context EEG foundation model that represents full EEG sessions as 3D spectrogram tokens and aligns embeddings with clinical reports and EHR data. On a 234-task benchmark, CLEF outperforms prior models on 229 tasks, improving mean AUROC from 0.65 to 0.74.
Clinical EEG interpretation requires reasoning over full EEG sessions and integrating signal patterns with clinical context. Existing EEG foundation models are largely designed for short-window decoding and do not incorporate clinical context. We introduce CLEF, a clinically grounded long-context EEG foundation model. CLEF represents EEG sessions as 3D multitaper spectrogram tokens, enabling tractable Transformer modeling at session scale, and aligns embeddings with neurologist reports and structured EHR data through contrastive objectives. We evaluate CLEF on a new 234-task benchmark spanning disease phenotypes, medication exposures, and EEG findings, with more than 260k EEG sessions from over 108k patients. CLEF outperforms prior EEG foundation models on 229 of 234 tasks, improving mean AUROC from 0.65 to 0.74. Reconstruction-only pretraining surpasses prior EEG foundation models, while report and EHR alignment yields further gains. Held-out concept and external-cohort experiments suggest that these representations transfer beyond observed alignment targets. These results support session-scale, clinically grounded representation learning as a promising foundation-model paradigm for clinical EEG.
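The abstract says that session embeddings are aligned with neurologist reports and structured EHR data through contrastive objectives, but does not give the form of those objectives. As a rough illustration only, the sketch below assumes a CLIP-style symmetric InfoNCE loss over paired EEG and report embeddings. The function names, the temperature of 0.07, and the use of cosine similarity are all assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_contrastive_loss(eeg_embs, report_embs, temperature=0.07):
    """Hypothetical symmetric InfoNCE loss (an assumption, not the paper's loss).

    eeg_embs[i] and report_embs[i] are treated as a matched pair; every
    other combination in the batch serves as a negative.
    """
    if not eeg_embs or len(eeg_embs) != len(report_embs):
        raise ValueError("need equally sized, non-empty batches")

    eeg = [l2_normalize(v) for v in eeg_embs]
    rep = [l2_normalize(v) for v in report_embs]
    n = len(eeg)

    # Cosine similarity between every EEG/report pair, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(e, r)) / temperature for r in rep]
        for e in eeg
    ]

    # EEG -> report direction: each row should pick out its own column.
    loss_e2r = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Report -> EEG direction: each column should pick out its own row.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_r2e = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return 0.5 * (loss_e2r + loss_r2e)


# Toy batch: two sessions whose report embeddings match or are swapped.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is near zero when each session sits next to its own report and large when the pairs are swapped, which is the behavior an alignment objective of this kind is meant to reward. The same function could in principle be applied to EHR-derived embeddings in place of report embeddings, although whether CLEF treats the two targets identically is not stated in the abstract.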