DELTA: Language Diffusion-based EEG-to-Text Architecture
This work addresses the challenge of generating text from EEG signals, which could enable reliable communication tools for individuals with disabilities. The contribution appears incremental, however, since it builds on existing diffusion and tokenization methods.
The paper tackles EEG-to-text conversion by introducing DELTA, which combines a Residual Vector Quantization (RVQ) tokenizer with a masked language diffusion model to reduce noise and error accumulation. On the ZuCo dataset, it reports improvements of up to 5.37 points in semantic alignment over autoregressive baselines and a BLEU-1 of 21.9.
Electroencephalogram (EEG)-to-text remains challenging due to high-dimensional noise, subject variability, and error accumulation in autoregressive decoding. We introduce DELTA, which pairs a Residual Vector Quantization (RVQ) EEG tokenizer with a masked language diffusion model (LLaDA). RVQ discretizes continuous EEG into multi-layer tokens to reduce noise and individual differences, while LLaDA reconstructs sentences via non-sequential denoising. On ZuCo, DELTA improves semantic alignment by up to 5.37 points over autoregressive baselines, achieving BLEU-1 21.9 and ROUGE-1 F 17.2 under word-level conditions. These results enable reliable text generation from small EEG-text datasets and point toward scalable multimodal EEG-language models.
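To illustrate how residual vector quantization turns a continuous feature vector into multi-layer tokens, the following minimal Python sketch quantizes a vector one codebook at a time, with each layer encoding the residual left by the layers before it. This is only a sketch under stated assumptions: the function names, the hand-written codebooks, and the two-dimensional toy feature are hypothetical and chosen for readability. DELTA's actual tokenizer, its learned codebooks, and its EEG preprocessing are not described in this summary and are not reproduced here.

```python
def nearest_index(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2 distance)."""
    best_index, best_dist = 0, float("inf")
    for index, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def rvq_encode(vector, codebooks):
    """Emit one token per layer; each layer quantizes the residual of the previous layers."""
    residual = list(vector)
    tokens = []
    for codebook in codebooks:
        index = nearest_index(residual, codebook)
        tokens.append(index)
        # Subtract the chosen code so the next layer only sees what is still unexplained.
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return tokens


def rvq_decode(tokens, codebooks):
    """Reconstruct an approximation by summing the selected code from every layer."""
    reconstruction = [0.0] * len(codebooks[0][0])
    for index, codebook in zip(tokens, codebooks):
        reconstruction = [x + c for x, c in zip(reconstruction, codebook[index])]
    return reconstruction


# Hypothetical toy setup: two layers, two codes per layer, 2-D features.
# In a real system the codebooks would be learned from data, not hand-written.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],    # layer 1: coarse codes
    [[0.0, 0.0], [0.5, -0.5]],   # layer 2: finer codes applied to the residual
]
toy_feature = [1.4, 0.6]         # stand-in for one continuous EEG feature vector

toy_tokens = rvq_encode(toy_feature, toy_codebooks)
toy_reconstruction = rvq_decode(toy_tokens, toy_codebooks)
print(toy_tokens)           # [1, 1]
print(toy_reconstruction)   # [1.5, 0.5]
```

In this toy case the first layer captures the coarse position of the feature and the second layer refines it, so the sequence of per-layer indices acts as the discrete token representation. A downstream language model would consume token indices of this kind rather than the raw continuous signal.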