EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping
This work applies multimodal language modeling to clinical phenotyping from EEG, a novel application domain, with incremental methodological extensions.
The paper tackles the problem of clinical phenotyping with limited labeled data by pioneering EEG-language models (ELMs) trained on 15,000 EEGs and clinical reports, achieving significant improvements over EEG-only models across four evaluations and enabling zero-shot classification and retrieval.
Multimodal language modeling has enabled breakthroughs in representation learning, yet remains unexplored for functional brain data in clinical phenotyping. This paper pioneers EEG-language models (ELMs) trained on clinical reports and 15,000 EEGs. We combine multimodal alignment in this novel domain with time-series cropping and text segmentation, which enables an extension based on multiple instance learning that alleviates the misalignment caused by irrelevant EEG or text segments. Our multimodal models significantly improve over EEG-only models across four clinical evaluations and, for the first time, enable zero-shot classification as well as retrieval of both neural signals and reports. In sum, these results highlight the potential of ELMs, representing significant progress for clinical applications.
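To make the idea of alignment over cropped EEG and segmented text more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each recording and each report has already been encoded into a "bag" of embedding vectors (one per EEG crop or text segment), that a recording-report score is the maximum cosine similarity over all crop-segment pairs (one common multiple instance learning pooling choice), and that training uses a symmetric contrastive loss. The function names, the max pooling, and the temperature value are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bag_similarity(eeg_crops, text_segments):
    """Score a recording against a report (assumed MIL max pooling).

    Only the best-matching crop/segment pair contributes, so crops or
    segments that are irrelevant to each other do not lower the score.
    """
    return max(cosine(c, s) for c in eeg_crops for s in text_segments)


def log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def mil_contrastive_loss(eeg_bags, text_bags, temperature=0.1):
    """Symmetric contrastive loss; bag i of each modality is a true pair."""
    n = len(eeg_bags)
    sim = [[bag_similarity(e, t) / temperature for t in text_bags]
           for e in eeg_bags]
    # EEG-to-text direction: each recording should pick its own report.
    loss_e2t = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Text-to-EEG direction: same computation on the transposed matrix.
    sim_t = [[sim[i][j] for i in range(n)] for j in range(n)]
    loss_t2e = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n
    return 0.5 * (loss_e2t + loss_t2e)


def zero_shot_classify(eeg_crops, prompt_bags):
    """Return the label whose (hypothetical) text prompts score highest."""
    scores = {label: bag_similarity(eeg_crops, segments)
              for label, segments in prompt_bags.items()}
    return max(scores, key=scores.get)


# Toy 2-D embeddings: each bag holds one informative and one irrelevant item.
eeg_bags = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, 1.0]],
]
text_bags = [
    [[1.0, 0.1], [0.0, -1.0]],
    [[-1.0, 0.1], [0.0, -1.0]],
]

loss_matched = mil_contrastive_loss(eeg_bags, text_bags)
loss_swapped = mil_contrastive_loss(eeg_bags, text_bags[::-1])

# Hypothetical class prompts, already embedded in the shared space.
prompts = {"normal": [[1.0, 0.1]], "abnormal": [[-1.0, 0.1]]}
predictions = [zero_shot_classify(bag, prompts) for bag in eeg_bags]
```

In this toy setup the loss is lower when recordings are paired with their own reports than when the reports are swapped, and zero-shot classification reduces to comparing a recording against embedded text prompts without any labeled training examples. A softer pooling (for example log-sum-exp) could replace the maximum; which aggregation the paper uses is not stated in the abstract.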