CL · Nov 16, 2018

Beam Search Decoding using Manner of Articulation Detection Knowledge Derived from Connectionist Temporal Classification

arXiv:1811.07720v1 · Has Code
Originality: Incremental advance
AI Analysis

This work targets speech recognition accuracy by incrementally extending beam search decoding with manner of articulation knowledge.

The paper tackles speech recognition by integrating manner of articulation detection into beam search decoding without needing phoneme alignments, and reports improved performance on the AN4 and LibriSpeech datasets.

Manner of articulation detection using deep neural networks requires a priori knowledge of the attribute-discriminative features or decent phoneme alignments. However, generating an appropriate phoneme alignment is complex, and its performance depends on the choice of the optimal number of senones, Gaussians, etc. In the first part of our work, we exploit manner of articulation detection using connectionist temporal classification (CTC), which does not need any phoneme alignment. Later, we modify the state-of-the-art character-based posteriors generated by CTC using the manner of articulation CTC detector. Beam search decoding is performed on the modified posteriors, and its impact on open-source datasets such as AN4 and LibriSpeech is observed.
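The abstract does not say how the character posteriors are modified, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a hypothetical fusion rule in which each character's per-frame posterior is multiplied by the posterior of its manner class and then renormalized, followed by a plain CTC prefix beam search with no language model. The names `fuse_posteriors`, `ctc_prefix_beam_search`, `char_to_manner`, and `weight`, the three-symbol alphabet, and all probability values are invented for illustration.

```python
from collections import defaultdict

import numpy as np

BLANK = 0  # assumption: index 0 is the CTC blank in both label sets


def fuse_posteriors(char_probs, manner_probs, char_to_manner, weight=1.0):
    """Hypothetical fusion rule: p'(c|t) is proportional to
    p(c|t) * p(manner(c)|t) ** weight, renormalized per frame."""
    manner_for_char = manner_probs[:, char_to_manner]  # shape (T, V)
    fused = char_probs * manner_for_char ** weight
    return fused / fused.sum(axis=1, keepdims=True)


def ctc_prefix_beam_search(probs, beam_width=4):
    """CTC prefix beam search over per-frame posteriors of shape (T, V)."""
    # Each prefix stores [prob. of paths ending in blank, ending in non-blank].
    beams = {(): [1.0, 0.0]}
    for frame in probs:
        candidates = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            for symbol, p in enumerate(frame):
                if symbol == BLANK:
                    # A blank keeps the prefix unchanged.
                    candidates[prefix][0] += (p_blank + p_non_blank) * p
                elif prefix and prefix[-1] == symbol:
                    # A repeat collapses into the same prefix ...
                    candidates[prefix][1] += p_non_blank * p
                    # ... unless a blank separated the two occurrences.
                    candidates[prefix + (symbol,)][1] += p_blank * p
                else:
                    candidates[prefix + (symbol,)][1] += (p_blank + p_non_blank) * p
        ranked = sorted(candidates.items(), key=lambda kv: kv[1][0] + kv[1][1],
                        reverse=True)
        beams = dict(ranked[:beam_width])
    return max(beams, key=lambda prefix: sum(beams[prefix]))


# Toy alphabet (illustrative): 0 = blank, 1 = 'a' (vowel), 2 = 'b' (stop).
char_to_manner = np.array([0, 1, 2])  # manner classes: 0 = blank, 1 = vowel, 2 = stop
char_probs = np.array([[0.1, 0.4, 0.5],
                       [0.8, 0.1, 0.1],
                       [0.8, 0.1, 0.1]])
manner_probs = np.array([[0.1, 0.8, 0.1],
                         [0.8, 0.1, 0.1],
                         [0.8, 0.1, 0.1]])

fused = fuse_posteriors(char_probs, manner_probs, char_to_manner)
raw_best = ctc_prefix_beam_search(char_probs)
fused_best = ctc_prefix_beam_search(fused)
print(raw_best, fused_best)
```

In this toy input the character model slightly prefers 'b' in the first frame while the manner detector strongly indicates a vowel, so the fused posteriors shift the decoded prefix toward 'a'. Because CTC outputs tend to be peaky, a real system would also have to deal with character and manner peaks that do not fall on the same frame, which this sketch ignores.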

