CognitionCapturerPro: Towards High-Fidelity Visual Decoding from EEG/MEG via Multi-modal Information and Asymmetric Alignment
This work addresses low-fidelity visual decoding from brain activity, a problem relevant to neuroscience and brain-computer interface applications, and represents an incremental advance over prior methods.
The paper tackles the challenge of reconstructing visual stimuli from EEG signals by proposing CognitionCapturerPro, a framework that integrates multi-modal priors and an asymmetric alignment module. Relative to the original CognitionCapturer, it reports improvements of 25.9% in Top-1 and 10.6% in Top-5 retrieval accuracy on the THINGS-EEG dataset.
Visual stimuli reconstruction from EEG remains challenging due to fidelity loss and representation shift. We propose CognitionCapturerPro, an enhanced framework that integrates EEG with multi-modal priors (images, text, depth, and edges) via collaborative training. Our core contributions include an uncertainty-weighted similarity scoring mechanism to quantify modality-specific fidelity and a fusion encoder for integrating shared representations. By employing a simplified alignment module and a pre-trained diffusion model, our method significantly outperforms the original CognitionCapturer on the THINGS-EEG dataset, improving Top-1 and Top-5 retrieval accuracy by 25.9% and 10.6%, respectively. Code is available at: https://github.com/XiaoZhangYES/CognitionCapturerPro.
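To make the idea of uncertainty-weighted similarity scoring concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the exact formulation, so the sketch assumes a generic inverse-variance weighting: each modality has a scalar log-variance, and its cosine-similarity matrix is weighted by the normalized precision exp(-log_var). The function names, the dictionary layout, and the `log_vars` parameter are all hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def uncertainty_weighted_scores(eeg_embs, target_embs, log_vars):
    """Combine per-modality similarity matrices into one score matrix.

    eeg_embs / target_embs: dicts mapping a modality name (e.g. "image",
    "text", "depth", "edge") to arrays of shape (n_queries, dim) and
    (n_candidates, dim).
    log_vars: dict mapping each modality to a scalar log-variance.
    ASSUMPTION: weights are normalized precisions exp(-log_var); the paper
    may define its uncertainty weighting differently.
    """
    modalities = list(eeg_embs)
    precisions = np.array([np.exp(-log_vars[m]) for m in modalities])
    weights = precisions / precisions.sum()  # weights sum to 1
    return sum(
        w * cosine_similarity(eeg_embs[m], target_embs[m])
        for w, m in zip(weights, modalities)
    )


def top_k_accuracy(scores, k):
    """Fraction of queries whose true candidate (same index) is in the top k."""
    top_k = np.argsort(-scores, axis=1)[:, :k]
    truth = np.arange(scores.shape[0])[:, None]
    return float((top_k == truth).any(axis=1).mean())
```

In this sketch, a modality with a larger log-variance (lower assumed fidelity) contributes less to the combined score, and Top-1 or Top-5 retrieval accuracy is then read off the combined matrix with `top_k_accuracy(scores, 1)` or `top_k_accuracy(scores, 5)`.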