CL Dec 13, 2016

Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints

arXiv:1612.03991v1
Originality: Incremental advance
AI Analysis

This work refines probabilistic transcriptions obtained from non-native speakers, which are used to adapt cross-lingual ASR systems to target languages. It is an incremental improvement over prior methods.

The paper refines probabilistic transcriptions obtained from non-native speakers for cross-lingual ASR using two techniques: a recurrent neural network-based noisy-channel model of phone misperception, and decoding with language-specific pronunciation constraints. These reduce the phone error rate of the trained ASR by 7% and 9%, respectively.

Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages. In this work, we describe two techniques to refine these probabilistic transcriptions: a noisy-channel model of non-native phone misperception is trained using a recurrent neural network, and decoded using minimally-resourced language-dependent pronunciation constraints. Both innovations improve quality of the transcript, and both innovations reduce phone error rate of a trained ASR, by 7% and 9% respectively.
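The idea of noisy-channel decoding under pronunciation constraints can be illustrated with a toy sketch. This is not the authors' implementation. The recurrent neural network is replaced here by a hand-written confusion table, only substitutions between equal-length phone sequences are modelled, and every name and number (`CHANNEL`, `LEXICON`, `decode`, the probabilities) is hypothetical and chosen purely for illustration. The decoder picks the allowed phone sequence that maximises prior times channel likelihood.

```python
import math

# Hypothetical channel model P(heard phone | spoken phone).
# The paper trains this with a recurrent neural network; a fixed
# table is used here only to keep the sketch self-contained.
CHANNEL = {
    "b": {"b": 0.7, "p": 0.3},
    "p": {"p": 0.8, "b": 0.2},
    "a": {"a": 0.9, "e": 0.1},
    "e": {"e": 0.8, "a": 0.2},
    "t": {"t": 0.9, "d": 0.1},
    "d": {"d": 0.6, "t": 0.4},
}

# Hypothetical language-specific constraint: the only phone sequences
# the target language allows, each with an assumed prior probability.
LEXICON = {
    ("b", "a", "d"): 0.5,
    ("p", "e", "t"): 0.3,
    ("b", "e", "t"): 0.2,
}


def channel_logprob(heard, spoken):
    """Log P(heard | spoken), assuming independent per-phone substitutions."""
    if len(heard) != len(spoken):
        return -math.inf  # insertions and deletions are not modelled here
    total = 0.0
    for h, s in zip(heard, spoken):
        p = CHANNEL.get(s, {}).get(h, 0.0)
        if p == 0.0:
            return -math.inf
        total += math.log(p)
    return total


def decode(heard):
    """Return the allowed sequence maximising log prior + log channel score."""
    best, best_score = None, -math.inf
    for spoken, prior in LEXICON.items():
        score = math.log(prior) + channel_logprob(heard, spoken)
        if score > best_score:
            best, best_score = spoken, score
    return best  # None if no allowed sequence can explain the input
```

For example, `decode(("p", "a", "t"))` scores every entry in `LEXICON` against the misheard sequence and returns the best-scoring allowed one, so an input that is not itself a valid pronunciation is mapped back onto the constraint set.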

