SD CL NC Dec 17, 2025

From Minutes to Days: Scaling Intracranial Speech Decoding with Supervised Pretraining

Linnea Evanson, Mingfang Zhang, Hubert Banville, Saarang Panchavati, Pierre Bourdillon, Jean-Rémi King

arXiv:2512.15830v1

h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses the scarcity of neural recordings available for decoding speech in patients and offers a scalable approach for both real-life and controlled settings, though its contribution is an incremental improvement in data utilization.

The researchers decode speech from brain activity by leveraging week-long intracranial and audio recordings, which increases the training data by over two orders of magnitude. The resulting contrastive learning model substantially outperforms models trained only on classic experimental data, with gains that scale log-linearly with dataset size.

Decoding speech from brain activity has typically relied on limited neural recordings collected during short and highly controlled experiments. Here, we introduce a framework to leverage week-long intracranial and audio recordings from patients undergoing clinical monitoring, effectively increasing the training dataset size by over two orders of magnitude. With this pretraining, our contrastive learning model substantially outperforms models trained solely on classic experimental data, with gains that scale log-linearly with dataset size. Analysis of the learned representations reveals that, while brain activity represents speech features, its global structure largely drifts across days, highlighting the need for models that explicitly account for cross-day variability. Overall, our approach opens a scalable path toward decoding and modeling brain representations in both real-life and controlled task settings.
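The abstract does not specify the contrastive objective, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE (CLIP-style) loss over paired brain and audio embeddings. The function name `info_nce_loss`, the `temperature` value, and the assumption that row i of each array comes from the same time window are illustrative and are not taken from the paper.

```python
import numpy as np

def info_nce_loss(brain_emb, audio_emb, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    brain_emb, audio_emb: arrays of shape (batch, dim). Row i of each is
    assumed (hypothetically) to come from the same time window, forming
    a positive pair; all other rows in the batch act as negatives.
    """
    # L2-normalise so that dot products are cosine similarities.
    b = brain_emb / np.linalg.norm(brain_emb, axis=1, keepdims=True)
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    logits = b @ a.T / temperature  # (batch, batch) similarity matrix

    def cross_entropy_diag(z):
        # Numerically stable row-wise log-softmax; targets lie on the diagonal.
        z = z - z.max(axis=1, keepdims=True)
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the brain-to-audio and audio-to-brain directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))
```

With this kind of objective, every additional hour of paired recording supplies both new positive pairs and new negatives, which is one plausible reason a contrastive setup can make use of week-long clinical recordings.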
