CL · LG · NE · Sep 27, 2016

Multi-task Recurrent Model for True Multilingual Speech Recognition

arXiv:1609.08337v1 · 2 citations
Originality: Incremental advance
AI Analysis

This work targets multilingual speech recognition in deployments where several languages may be spoken, and is an incremental improvement over existing multi-task approaches.

The paper tackles the performance degradation that true multilingual speech recognition systems suffer when languages compete in the decoding space. It proposes a multi-task recurrent model in which a language recognition component informs the ASR component, and reports improved performance on an English-Chinese bilingual task.

Research on multilingual speech recognition remains attractive yet challenging. Recent studies focus on learning shared structures under the multi-task paradigm, in particular a feature-sharing structure. This approach has been found effective in improving performance on each individual language. However, it is only useful when the deployed system supports just one language. In a true multilingual scenario where multiple languages are allowed, performance is significantly reduced by the competition among languages in the decoding space. This paper presents a multi-task recurrent model that involves a multilingual speech recognition (ASR) component and a language recognition (LR) component; the ASR component is informed of the language information by the LR component, leading to language-aware recognition. We tested the approach on an English-Chinese bilingual recognition task. The results show that the proposed multi-task recurrent model can improve the performance of multilingual recognition systems.
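
The language-aware idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain tanh recurrent layers with random, untrained weights, and it assumes the LR posterior from the previous frame is simply concatenated to the ASR input. All names (`RecurrentCell`, `run_language_aware`) and all dimensions are hypothetical, chosen only to show how one component can inform the other.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class RecurrentCell:
    """Plain tanh recurrent layer with a softmax output.

    Assumption: a simple stand-in for whatever recurrent unit the paper uses.
    """

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.W_in = rand_matrix(n_hidden, n_in, rng)
        self.W_rec = rand_matrix(n_hidden, n_hidden, rng)
        self.W_out = rand_matrix(n_out, n_hidden, rng)
        self.h = [0.0] * n_hidden

    def step(self, x):
        a = matvec(self.W_in, x)
        b = matvec(self.W_rec, self.h)
        self.h = [math.tanh(u + v) for u, v in zip(a, b)]
        return softmax(matvec(self.W_out, self.h))


def run_language_aware(frames, n_feat, n_langs=2, n_units=6, n_hidden=8, seed=0):
    """Run the LR and ASR components jointly over a sequence of feature frames.

    Returns one (unit_posterior, language_posterior) pair per frame.
    """
    rng = random.Random(seed)
    lr = RecurrentCell(n_feat, n_hidden, n_langs, rng)
    # The ASR input is the acoustic frame plus the LR posterior (assumed wiring).
    asr = RecurrentCell(n_feat + n_langs, n_hidden, n_units, rng)
    lang_post = [1.0 / n_langs] * n_langs  # uninformative belief before the first frame
    outputs = []
    for x in frames:
        unit_post = asr.step(x + lang_post)  # ASR sees the frame and LR's previous belief
        lang_post = lr.step(x)               # LR updates its belief from the same frame
        outputs.append((unit_post, lang_post))
    return outputs
```

In a trained system the two components would be optimized jointly on both tasks; here the weights are random, so the sketch shows only the flow of information, not any recognition accuracy.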

