Are EEG-to-Text Models Working?
This paper addresses evaluation reliability for EEG-to-Text systems, which is crucial for developing robust brain-computer interfaces. Its contribution is incremental, however, because it focuses on evaluation methodology rather than on new models.
This paper critically analyzes EEG-to-Text translation models. It finds that previous studies relied on flawed evaluation methods, such as implicit teacher-forcing, and lacked noise baselines, and it shows that model performance on noise inputs can be comparable to performance on EEG data.
This work critically analyzes existing models for open-vocabulary EEG-to-Text translation. We identify a crucial limitation: previous studies often employed implicit teacher-forcing during evaluation, artificially inflating performance metrics. Additionally, they lacked a critical benchmark: a comparison of model performance on pure noise inputs. We propose a methodology to differentiate between models that truly learn from EEG signals and those that simply memorize training data. Our analysis reveals that model performance on noise data can be comparable to that on EEG data. These findings highlight the need for stricter evaluation practices in EEG-to-Text research, emphasizing transparent reporting and rigorous benchmarking with noise inputs. This approach will lead to more reliable assessments of model capabilities and pave the way for robust EEG-to-Text communication systems. Code is available at https://github.com/NeuSpeech/EEG-To-Text
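The two evaluation pitfalls named in the abstract can be illustrated with a toy sketch. It is not the paper's code or any of the models it analyzes: the bigram "decoder", the training sentences, and all function names below are hypothetical, chosen only to show how teacher-forced evaluation can reward a model that ignores its input, and why a noise baseline exposes this.

```python
import random


def make_bigram_decoder(sentences):
    """Build a toy decoder that memorizes next-token counts from training text.

    Hypothetical stand-in for a sequence model: it never looks at its input
    signal, so any score it earns comes purely from memorized text.
    """
    table = {}
    for sent in sentences:
        tokens = ["<s>"] + sent
        for prev, nxt in zip(tokens, tokens[1:]):
            counts = table.setdefault(prev, {})
            counts[nxt] = counts.get(nxt, 0) + 1

    def step(signal, prev_token):
        # `signal` is deliberately unused: the model ignores EEG and noise alike.
        options = table.get(prev_token)
        if not options:
            return "<unk>"
        # Most frequent continuation; ties go to the first one seen in training.
        return max(options, key=options.get)

    return step


def decode_teacher_forced(step, signal, target):
    """Predict each token from the ground-truth previous token (leaks the target)."""
    previous = ["<s>"] + target[:-1]
    return [step(signal, p) for p in previous]


def decode_free_running(step, signal, length):
    """Predict each token from the model's own previous prediction."""
    output, prev = [], "<s>"
    for _ in range(length):
        prev = step(signal, prev)
        output.append(prev)
    return output


def token_accuracy(pred, target):
    """Fraction of positions where the prediction matches the target."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)


train = [
    ["the", "cat", "sat", "down"],
    ["a", "dog", "ran", "home"],
    ["the", "dog", "ran", "away"],
]
target = ["a", "dog", "ran", "home"]

rng = random.Random(0)
eeg_like = [rng.gauss(0.0, 1.0) for _ in range(8)]  # placeholder for a real recording
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]     # pure noise input

step = make_bigram_decoder(train)
for name, signal in [("eeg-like", eeg_like), ("noise", noise)]:
    forced = token_accuracy(decode_teacher_forced(step, signal, target), target)
    free = token_accuracy(decode_free_running(step, signal, len(target)), target)
    print(f"{name}: teacher-forced={forced:.2f} free-running={free:.2f}")
```

In this sketch the teacher-forced score is identical for the EEG-like and noise inputs and higher than the free-running score, even though the decoder never reads its input. That is the pattern a noise baseline is meant to catch: a score that does not drop when the signal is replaced by noise cannot be attributed to the signal.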