Towards Neural Foundation Models for Vision: Aligning EEG, MEG, and fMRI Representations for Decoding, Encoding, and Modality Conversion
This work addresses the challenge of integrating multimodal brain imaging data for neuroscience and AI applications, and represents an incremental step towards foundation models in this domain.
The paper tackles the problem of aligning neural data and visual stimuli across EEG, MEG, and fMRI modalities using contrastive learning, yielding a model that captures semantic information for decoding, encoding, and modality conversion tasks.
This paper presents a novel approach towards creating a foundation model that aligns neural data and visual stimuli across multimodal representations of brain activity by leveraging contrastive learning. We use electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI) data. We demonstrate the framework's capabilities through three key experiments: decoding visual information from neural data, encoding images into neural representations, and converting between neural modalities. The results highlight the model's ability to capture semantic information across different brain imaging techniques, illustrating its potential in decoding, encoding, and modality conversion tasks.
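To make the contrastive alignment idea concrete, the following is a minimal sketch of one common formulation: a symmetric InfoNCE (CLIP-style) loss between a batch of neural embeddings and the embeddings of the images that evoked them. The abstract does not specify the loss, the encoders, or any hyperparameters, so everything here is an assumption for illustration: the function names (`symmetric_info_nce`, `retrieve`), the temperature of 0.07, and the premise that each modality has already been mapped to a fixed-size embedding.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(neural_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE loss for paired embeddings of shape (batch, dim).

    Row k of neural_emb (e.g. an EEG, MEG, or fMRI embedding) is assumed to be
    paired with row k of image_emb. The temperature value is an assumption.
    """
    z_n = l2_normalize(neural_emb)
    z_i = l2_normalize(image_emb)
    logits = z_n @ z_i.T / temperature  # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    # Neural -> image: each row should put its mass on the diagonal entry.
    loss_n2i = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Image -> neural: each column should put its mass on the diagonal entry.
    loss_i2n = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_n2i + loss_i2n)


def retrieve(neural_emb, image_emb):
    """Decoding as retrieval: index of the most similar image per neural sample."""
    similarity = l2_normalize(neural_emb) @ l2_normalize(image_emb).T
    return np.argmax(similarity, axis=1)
```

Under this sketch, decoding amounts to nearest-neighbour retrieval in the shared space (`retrieve`), and the same similarity matrix could in principle be computed between two neural modalities to study modality conversion. Whether the paper's framework works this way is not stated in the abstract.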