CV · Mar 7

Aligning What EEG Can See: Structural Representations for Brain-Vision Matching

Jingyi Tang, Shuai Jiang, Fei Su, Zhicheng Zhao

arXiv:2603.07077v1

Predicted impact top 38% in CV · last 90 days · Originality Highly original

AI Analysis

This work provides a significant improvement in visual decoding from EEG, which is crucial for advancing non-invasive brain-computer interfaces.

This paper addresses the cross-modal information mismatch that arises when decoding visual information from EEG, by aligning EEG signals with intermediate visual layers rather than final-layer semantic embeddings. The method reaches 84.6% accuracy (+21.4%) on zero-shot visual decoding on the THINGS-EEG dataset and yields up to a 129.8% performance gain across diverse EEG baselines.

Visual decoding from electroencephalography (EEG) has emerged as a highly promising avenue for non-invasive brain-computer interfaces (BCIs). Existing EEG-based decoding methods predominantly align brain signals with the final-layer semantic embeddings of deep visual models. However, relying on these highly abstracted embeddings inevitably leads to severe cross-modal information mismatch. In this work, we introduce the concept of Neural Visibility and accordingly propose the EEG-Visible Layer Selection Strategy, aligning EEG signals with intermediate visual layers to minimize this mismatch. Furthermore, to accommodate the multi-stage nature of human visual processing, we propose a novel Hierarchically Complementary Fusion (HCF) framework that jointly integrates visual representations from different hierarchical levels. Extensive experiments demonstrate that our method achieves state-of-the-art performance, reaching an 84.6% accuracy (+21.4%) on zero-shot visual decoding on the THINGS-EEG dataset. Moreover, our method achieves up to a 129.8% performance gain across diverse EEG baselines, demonstrating its robust generalizability.
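To make the idea of aligning EEG with intermediate visual layers concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that EEG and image features have already been projected to a shared dimension per visual layer, that alignment uses a symmetric InfoNCE-style contrastive loss, and that layers are fused by averaging their similarity matrices. None of these choices is stated in the abstract, and the names `info_nce`, `zero_shot_top1`, and the temperature value are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def log_softmax(z, axis):
    """Numerically stable log-softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=axis, keepdims=True))


def info_nce(eeg_emb, img_emb, temperature=0.07):
    """Symmetric contrastive loss for paired EEG and image embeddings.

    Row i of eeg_emb is assumed to be paired with row i of img_emb.
    The loss form and temperature are assumptions for illustration.
    """
    logits = l2_normalize(eeg_emb) @ l2_normalize(img_emb).T / temperature
    idx = np.arange(logits.shape[0])
    loss_eeg_to_img = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_img_to_eeg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_eeg_to_img + loss_img_to_eeg)


def zero_shot_top1(eeg_per_layer, img_per_layer):
    """Top-1 retrieval accuracy after fusing several visual layers.

    Each list entry holds embeddings aligned to one visual layer.
    Fusion by averaging similarity matrices is an assumed stand-in
    for the paper's fusion framework, whose details are not given here.
    """
    sims = [
        l2_normalize(e) @ l2_normalize(v).T
        for e, v in zip(eeg_per_layer, img_per_layer)
    ]
    fused = np.mean(sims, axis=0)
    predictions = fused.argmax(axis=1)
    return float((predictions == np.arange(fused.shape[0])).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins: 8 trials, 16-dim embeddings, two visual layers.
    img_layers = [rng.normal(size=(8, 16)) for _ in range(2)]
    eeg_layers = [v + 0.1 * rng.normal(size=v.shape) for v in img_layers]
    print("loss (layer 0):", info_nce(eeg_layers[0], img_layers[0]))
    print("top-1 accuracy:", zero_shot_top1(eeg_layers, img_layers))
```

In this sketch, swapping which visual layer supplies `img_emb` is the only change needed to compare alignment targets, which is the kind of comparison a layer selection strategy would rely on.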
