Brain-language fusion enables interactive neural readout and in-silico experimentation
This work addresses the need for interactive brain-computer interfaces with an approach to neural decoding that could benefit both neuroscience and AI, although its extension of LLMs to neural data is incremental.
The paper tackles the problem of static neural decoding by introducing CorText, a framework that integrates neural activity into an LLM's latent space and thereby enables natural language interaction with brain data. It achieves accurate image captioning and zero-shot generalization to semantic categories not seen during training.
Large language models (LLMs) have revolutionized human-machine interaction and have been extended by embedding diverse modalities such as images into a shared language space. Yet, neural decoding has remained constrained by static, non-interactive methods. We introduce CorText, a framework that integrates neural activity directly into the latent space of an LLM, enabling open-ended, natural language interaction with brain data. Trained on fMRI data recorded during viewing of natural scenes, CorText generates accurate image captions and answers more detailed questions better than controls, while having access to neural data only. We show that CorText achieves zero-shot generalization beyond semantic categories seen during training. Furthermore, we present a counterfactual analysis that emulates in-silico cortical microstimulation. These advances mark a shift from passive decoding toward generative, flexible interfaces between brain activity and language.
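As a rough illustration of the two ideas in the abstract (placing neural activity in an LLM's latent space, and a counterfactual perturbation that emulates microstimulation), the following NumPy sketch projects a voxel vector into a few "brain tokens" and prepends them to prompt embeddings. Everything here is an assumption made for illustration and is not taken from the paper: the linear projection, the dimensions, the additive perturbation, and the names `brain_to_tokens`, `build_llm_input`, and `stimulate` are all hypothetical. No actual LLM is called.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the sketch small.
N_VOXELS = 1000        # fMRI voxels per brain volume (assumed)
D_MODEL = 64           # LLM embedding width (assumed)
N_BRAIN_TOKENS = 4     # soft tokens produced per brain volume (assumed)

# Stand-in for a learned projection; a real system would train these weights.
W = rng.normal(scale=0.01, size=(N_VOXELS, N_BRAIN_TOKENS * D_MODEL))
b = np.zeros(N_BRAIN_TOKENS * D_MODEL)


def brain_to_tokens(voxels: np.ndarray) -> np.ndarray:
    """Map one voxel vector to N_BRAIN_TOKENS embeddings of width D_MODEL."""
    return (voxels @ W + b).reshape(N_BRAIN_TOKENS, D_MODEL)


def build_llm_input(voxels: np.ndarray, prompt_embeddings: np.ndarray) -> np.ndarray:
    """Prepend the brain tokens to the embedded text prompt."""
    return np.concatenate([brain_to_tokens(voxels), prompt_embeddings], axis=0)


def stimulate(voxels: np.ndarray, roi_mask: np.ndarray, gain: float = 2.0) -> np.ndarray:
    """Counterfactual input: add a constant to voxels in a region of interest."""
    perturbed = voxels.copy()
    perturbed[roi_mask] += gain
    return perturbed


# Synthetic data standing in for a recorded volume and an embedded question.
voxels = rng.normal(size=N_VOXELS)
prompt = rng.normal(size=(6, D_MODEL))
roi_mask = np.zeros(N_VOXELS, dtype=bool)
roi_mask[:50] = True   # hypothetical region of interest

baseline = build_llm_input(voxels, prompt)
counterfactual = build_llm_input(stimulate(voxels, roi_mask), prompt)
```

In this sketch the perturbation changes only the brain-token rows of the input sequence, while the prompt rows stay identical. Feeding both sequences to the same language model and comparing its answers would be one way to run the kind of in-silico experiment the abstract describes.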