ASSD Nov 5, 2018

Manner of Articulation Detection using Connectionist Temporal Classification to Improve Automatic Speech Recognition Performance

arXiv:1811.01644v1 | 2 citations | Has Code
Originality: Incremental advance
AI Analysis

This work aims to improve speech recognition accuracy by incrementally extending CTC-based methods with manner-of-articulation knowledge.

The paper detects manners of articulation without phoneme alignment, using an end-to-end CTC-based model, and feeds that knowledge back into character-level recognition. The resulting manner-based character CTC outperforms the baseline character CTC on datasets such as AN4, LibriSpeech, and TEDLIUM-2.

Conventionally, the manners of articulation in a speech signal are derived using discriminative signal processing techniques or deep learning approaches. However, training such complex systems involves feature extraction, phoneme forced alignment, and deep neural network training. In our work, we first detect the manners of articulation without phoneme alignment, using end-to-end manner-of-articulation modeling based on connectionist temporal classification (CTC). The manner-of-articulation knowledge is then applied to the conventional character CTC path to regenerate a new character CTC path. The modified manner-based character CTC is evaluated on open-source speech datasets such as AN4, LibriSpeech, and TEDLIUM-2, and it outperforms the baseline character CTC.
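To make the alignment-free idea concrete, here is a minimal sketch of the two pieces a CTC setup of this kind relies on: a sequence-level manner-of-articulation target and the standard CTC path collapse. The ARPAbet-style phoneme symbols, the class names in `MANNER_OF`, and the helper names `manner_targets` and `ctc_collapse` are assumptions chosen for illustration, not the paper's label set or code. The sketch does not attempt the paper's step of regenerating the character CTC path, which the abstract does not specify.

```python
# Hypothetical manner-of-articulation classes for ARPAbet-style consonants.
# The paper's actual label inventory may differ.
MANNER_OF = {
    "P": "stop", "B": "stop", "T": "stop", "D": "stop", "K": "stop", "G": "stop",
    "F": "fricative", "V": "fricative", "S": "fricative", "Z": "fricative",
    "SH": "fricative", "ZH": "fricative", "TH": "fricative", "DH": "fricative",
    "HH": "fricative",
    "CH": "affricate", "JH": "affricate",
    "M": "nasal", "N": "nasal", "NG": "nasal",
    "L": "approximant", "R": "approximant", "W": "approximant", "Y": "approximant",
}
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
          "IH", "IY", "OW", "OY", "UH", "UW"}

BLANK = "_"  # assumed symbol for the CTC blank label


def manner_targets(phonemes):
    """Map a phoneme sequence to a manner-class sequence.

    CTC training needs only this unaligned label sequence per utterance,
    not a frame-level (forced) alignment.
    """
    targets = []
    for p in phonemes:
        base = p.rstrip("012")  # drop ARPAbet stress digits, e.g. "IY1" -> "IY"
        targets.append("vowel" if base in VOWELS else MANNER_OF[base])
    return targets


def ctc_collapse(path):
    """Collapse a frame-level CTC path: merge repeated labels, then drop blanks."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return out


if __name__ == "__main__":
    print(manner_targets(["S", "P", "IY1", "CH"]))
    frames = ["_", "fricative", "fricative", "_", "stop", "vowel", "vowel", "_"]
    print(ctc_collapse(frames))
```

A blank between two identical labels keeps them distinct after collapsing, which is how CTC represents genuinely repeated symbols.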

