HC SD | Mar 14

LightBeam: An Accurate and Memory-Efficient CTC Decoder for Speech Neuroprostheses

Ebrahim Feghhi, Junlin Hu, Nima Hadidi, Jonathan C. Kao

arXiv:2603.14002 | 89.6 | h-index: 3 | Has Code

AI Analysis

This addresses a critical bottleneck in making speech neuroprostheses more accessible for patients with dysarthria and anarthria, though it is an incremental improvement on existing decoder methods.

The paper tackles the excessive memory requirements of CTC decoders for speech neuroprostheses, which limit accessibility for patients and researchers. It proposes LightBeam, a non-WFST-based decoder that reduces RAM usage from ~320 GB to ~10 GB while achieving state-of-the-art performance on the Brain-to-Text '24 and '25 benchmarks.

A promising pathway for restoring communication in patients with dysarthria and anarthria is speech neuroprostheses, which directly decode speech from cortical neural activity. Two benchmarks, Brain-to-Text '24 and '25, released intracranial recordings from patients with dysarthria along with a baseline algorithm trained with Connectionist Temporal Classification (CTC). Despite significant innovation on these benchmarks, all leading published prior work relies on a WFST-based CTC decoder that requires ~320 GB of RAM. These memory requirements limit accessibility for both patients and researchers. Here, we propose LightBeam, a non-WFST-based CTC decoder that requires only ~10 GB of RAM and achieves state-of-the-art performance on both benchmarks. LightBeam achieves this by integrating an LLM into the beam-search process via delayed fusion, obviating the prior need for a large N-gram LM. LightBeam is implemented in Python and is open-source.
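To make the idea of fusing a language model into CTC beam search concrete, here is a minimal sketch. It is not LightBeam's implementation. It assumes one simple reading of "delayed fusion": hypotheses are first pruned on acoustic (CTC) score alone, and the language model is consulted only for the surviving shortlist, and only on words that are already complete. The names `decode`, `shortlist`, `lm_weight`, and the toy `lm_score` callable are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")

def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def decode(log_probs, vocab, lm_score, beam_size=4, shortlist=16,
           lm_weight=1.0, blank=0):
    """CTC prefix beam search with a (hypothetical) delayed LM fusion step.

    log_probs: list of frames, each a list of per-token log-probabilities.
    lm_score:  assumed callable mapping a list of words to a log-probability.
    """
    # prefix -> (log P of paths ending in blank, log P ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    lm_cache = {}

    def text(prefix):
        return "".join(vocab[t] for t in prefix)

    def lm_bonus(prefix):
        # Only words already followed by a space count as complete.
        s = text(prefix)
        done = s[: s.rfind(" ") + 1].strip()
        if done not in lm_cache:
            lm_cache[done] = lm_score(done.split()) if done else 0.0
        return lm_weight * lm_cache[done]

    def acoustic(item):
        return logaddexp(*item[1])

    for frame in log_probs:
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (pb, pnb) in beams.items():
            total = logaddexp(pb, pnb)
            for t, lp in enumerate(frame):
                if t == blank:
                    b, nb = nxt[prefix]
                    nxt[prefix] = (logaddexp(b, total + lp), nb)
                    continue
                ext = prefix + (t,)
                b, nb = nxt[ext]
                if prefix and prefix[-1] == t:
                    # A repeated token extends the prefix only after a blank...
                    nxt[ext] = (b, logaddexp(nb, pb + lp))
                    # ...otherwise it collapses into the same prefix.
                    b, nb = nxt[prefix]
                    nxt[prefix] = (b, logaddexp(nb, pnb + lp))
                else:
                    nxt[ext] = (b, logaddexp(nb, total + lp))
        # Stage 1: prune on acoustic score only (no LM calls yet).
        short = sorted(nxt.items(), key=acoustic, reverse=True)[:shortlist]
        # Stage 2: apply the LM to the shortlist, then keep the final beam.
        short.sort(key=lambda it: acoustic(it) + lm_bonus(it[0]), reverse=True)
        beams = dict(short[:beam_size])

    return text(next(iter(beams)))  # dicts keep insertion order: best first

# Toy example: two frames over the vocabulary [blank, "a", "b", " "].
vocab = ["", "a", "b", " "]
frames = [[math.log(p) for p in f]
          for f in ([0.03, 0.5, 0.45, 0.02], [0.04, 0.03, 0.03, 0.9])]
toy_lm = lambda words: sum(0.0 if w == "b" else -5.0 for w in words)

print(repr(decode(frames, vocab, toy_lm, lm_weight=0.0)))  # 'a ' (acoustics only)
print(repr(decode(frames, vocab, toy_lm, lm_weight=1.0)))  # 'b ' (LM flips the choice)
```

Because the language model is called only on shortlisted hypotheses and its scores are cached per completed word sequence, the number of LM calls stays small. A practical decoder would also need to handle partial words, length normalization, and batching of LLM calls, none of which this sketch attempts.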
