LG · AI · HC · Oct 15, 2025

NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models

arXiv:2510.13068v1 · 6 citations · h-index: 81
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in EEG representation learning for researchers and practitioners in neuroscience and AI, offering incremental improvements in tokenization methods.

The paper addresses EEG signal tokenization for large brainwave models. It introduces NeuroRVQ, a codebook-based tokenizer that combines multi-scale feature extraction with hierarchical residual vector quantization, and it reports lower reconstruction error and better downstream-task performance than existing models.

Electroencephalography (EEG) captures neural activity across multiple temporal and spectral scales, yielding signals that are rich but complex for representation learning. Recently, EEG foundation models trained to predict masked signal-tokens have shown promise for learning generalizable representations. However, their performance is hindered by their signal tokenization modules. Existing neural tokenizers fail to preserve high-frequency dynamics, limiting their ability to reconstruct EEG signals with high fidelity. We introduce NeuroRVQ, a scalable Large Brainwave Model (LBM) centered on a codebook-based tokenizer. Our tokenizer integrates: (i) multi-scale feature extraction modules that capture the full-frequency neural spectrum; (ii) hierarchical residual vector quantization (RVQ) codebooks for high-resolution encoding; and (iii) an EEG signal phase- and amplitude-aware loss function for efficient training. This design enables efficient EEG compression while supporting accurate reconstruction across all frequency bands, leading to robust generative masked modeling. Our empirical results demonstrate that NeuroRVQ achieves lower reconstruction error and outperforms existing LBMs on a variety of downstream tasks. More broadly, the NeuroRVQ tokenizer establishes a strong prior for codebook-based general-purpose brainwave models, enabling advances in neural decoding, generative modeling, and multimodal biosignal integration.
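
The residual vector quantization (RVQ) idea named in the abstract can be illustrated with a minimal NumPy sketch. This is not the NeuroRVQ implementation: the function names `rvq_encode` and `rvq_decode`, the fixed toy codebooks, and the 2-dimensional input are all assumptions made for illustration. A real tokenizer would learn its codebooks and apply them to multi-scale features of EEG patches rather than to a raw vector.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize x with a stack of codebooks (hypothetical helper).

    Each stage picks the nearest code to the residual left over by the
    previous stage, so later codebooks refine finer detail.
    """
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:  # each cb has shape (num_codes, dim)
        dists = np.sum((cb - residual) ** 2, axis=1)
        k = int(np.argmin(dists))
        indices.append(k)
        residual = residual - cb[k]
    return indices, residual

def rvq_decode(indices, codebooks):
    """Reconstruct by summing the selected code from every stage."""
    return sum(cb[k] for cb, k in zip(codebooks, indices))

# Toy codebooks (assumed values, not from the paper).
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),    # coarse stage
    np.array([[0.0, 0.0], [0.5, -0.5]]),   # refinement stage
]
x = np.array([1.4, 0.6])
indices, residual = rvq_encode(x, codebooks)
x_hat = rvq_decode(indices, codebooks)
```

In this toy case the coarse stage selects `[1.0, 1.0]` and the second stage encodes what remains, so the token for `x` is a short list of indices, one per codebook. Adding stages shrinks the leftover residual, which is the sense in which hierarchical codebooks can give a higher-resolution encoding.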

