LG · CL · SD · AS · ML · Apr 8, 2019

SPEAK YOUR MIND! Towards Imagined Speech Recognition With Hierarchical Deep Learning

arXiv:1904.05746v1 · 56 citations
Originality: Highly original
AI Analysis

This work addresses the challenge of enabling vocal communication for people with speech impairments through brain-computer interfaces. It presents a novel method rather than an incremental improvement.

The researchers tackled the problem of decoding imagined speech from brain signals by developing a hierarchical deep learning BCI system that classifies 11 speech tokens, achieving 83.42% accuracy for phonological classification and 53.36% for token identification.

Speech-related Brain-Computer Interface (BCI) technologies provide effective vocal communication strategies for controlling devices through speech commands interpreted from brain signals. In order to infer imagined speech from active thoughts, we propose a novel hierarchical deep learning BCI system for subject-independent classification of 11 speech tokens including phonemes and words. Our novel approach exploits predicted articulatory information of six phonological categories (e.g., nasal, bilabial) as an intermediate step for classifying the phonemes and words, thereby finding the discriminative signal responsible for natural speech synthesis. The proposed network is composed of a hierarchical combination of spatial and temporal CNNs cascaded with a deep autoencoder. Our best models on the KARA database achieve an average accuracy of 83.42% across the six different binary phonological classification tasks, and 53.36% for the individual token identification task, significantly outperforming our baselines. Ultimately, our work suggests the possible existence of a brain imagery footprint for the underlying articulatory movement related to different sounds that can be used to aid imagined speech decoding.
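
The two-stage idea in the abstract (predict six binary phonological categories first, then use those predictions to identify one of 11 tokens) can be illustrated with a minimal sketch. This is not the authors' network: the CNN and autoencoder feature extractors are replaced by a hypothetical fixed-length feature vector, both stages are plain linear layers with random weights, and every name and size other than the counts 6 and 11 is an assumption made for illustration.

```python
import math
import random

N_CATEGORIES = 6   # binary phonological categories (count taken from the abstract)
N_TOKENS = 11      # phonemes and words (count taken from the abstract)
N_FEATURES = 32    # hypothetical size of a learned EEG feature vector


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(inputs, weights):
    """One output per weight vector: the dot product with the inputs."""
    return [sum(i * w for i, w in zip(inputs, row)) for row in weights]


def predict(features, w_phon, w_token):
    # Stage 1: six independent binary predictions (e.g. nasal vs. not nasal).
    phon_probs = [sigmoid(s) for s in linear(features, w_phon)]
    # Stage 2: the token classifier sees only the predicted phonological
    # profile, so it acts as the intermediate representation.
    token_probs = softmax(linear(phon_probs, w_token))
    return phon_probs, token_probs


# Random stand-ins for trained weights and for one trial's features.
rng = random.Random(0)
features = [rng.gauss(0, 1) for _ in range(N_FEATURES)]
w_phon = [[rng.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(N_CATEGORIES)]
w_token = [[rng.gauss(0, 1) for _ in range(N_CATEGORIES)] for _ in range(N_TOKENS)]

phon_probs, token_probs = predict(features, w_phon, w_token)
predicted_token = max(range(N_TOKENS), key=lambda i: token_probs[i])
print(len(phon_probs), len(token_probs), predicted_token)
```

With random weights the predicted token is meaningless; the sketch only shows how the six phonological outputs feed the 11-way token decision.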

