Language Reconstruction with Brain Predictive Coding from fMRI Data
This work addresses the challenge of improving language reconstruction from brain signals for applications in brain-computer interfaces and neuroscience, representing an incremental advance by integrating predictive coding into existing methods.
The paper tackles the problem of reconstructing continuous language from fMRI brain signals by incorporating predictive coding theory. It proposes a model that jointly performs neural decoding and brain prediction, and reports a state-of-the-art BLEU-1 score of 27.8% on the Narratives dataset.
Many recent studies have shown that the perception of speech can be decoded from brain signals and subsequently reconstructed as continuous language. However, there is little neurological grounding for how the semantic information embedded in brain signals can be used more effectively to guide language reconstruction. The theory of predictive coding suggests that the human brain continuously predicts future word representations spanning multiple timescales. This implies that the decoding of brain signals could be associated with a predictable future. To explore predictive coding in the context of language reconstruction, this paper proposes a novel model, \textsc{PredFT}, for jointly modeling neural decoding and brain prediction. It consists of a main decoding network for language reconstruction and a side network for predictive coding. The side network obtains a brain predictive coding representation from the relevant brain regions of interest with a multi-head self-attention module. This representation is fused into the main decoding network with cross-attention to facilitate the language model's generation process. Experiments are conducted on Narratives, the largest naturalistic language comprehension fMRI dataset. \textsc{PredFT} achieves current state-of-the-art decoding performance with a maximum BLEU-1 score of $27.8\%$.
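To make the two attention steps concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that ROI features and decoder states are lists of equal-width vectors, omits all learned query/key/value and output projections, and adds a residual connection in the fusion step. The function names (`multi_head_self_attention`, `fuse`) and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def multi_head_self_attention(x, num_heads):
    """Split the feature axis into heads, self-attend per head, concatenate.

    Assumption: learned projections are omitted, so each head simply
    attends over its own slice of the input features.
    """
    dim = len(x[0])
    assert dim % num_heads == 0, "feature width must divide evenly into heads"
    head_dim = dim // num_heads
    heads = []
    for i in range(num_heads):
        part = [row[i * head_dim:(i + 1) * head_dim] for row in x]
        heads.append(attention(part, part, part))
    # Concatenate the per-head outputs for every position.
    return [sum((head[t] for head in heads), []) for t in range(len(x))]


def fuse(decoder_states, predictive_repr):
    """Cross-attention: decoder states query the side-network output.

    Assumption: the attended vectors are added back residually.
    """
    attended = attention(decoder_states, predictive_repr, predictive_repr)
    return [[s + a for s, a in zip(state, att)]
            for state, att in zip(decoder_states, attended)]


# Toy inputs (hypothetical): 3 ROI feature vectors and 2 decoder token states.
roi_features = [[0.1, 0.4, -0.2, 0.3],
                [0.0, 0.2, 0.5, -0.1],
                [0.3, -0.3, 0.1, 0.2]]
decoder_states = [[0.2, 0.1, 0.0, -0.4],
                  [-0.1, 0.3, 0.2, 0.1]]

predictive_repr = multi_head_self_attention(roi_features, num_heads=2)
fused = fuse(decoder_states, predictive_repr)
```

In this sketch the side network's output has one vector per ROI input, while the fused result keeps the decoder's sequence length, so the language model's token positions are unchanged by the fusion.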