CL, Nov 2, 2018

Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model

arXiv:1811.00787v1, 47 citations
Originality: Incremental advance
AI Analysis

This method addresses the shortage of paired speech-text data in speech recognition, though it appears incremental, since it builds on existing adversarial training and language modeling techniques.

The paper improves end-to-end speech recognition through adversarial training with a criticizing language model, which makes it possible to leverage unpaired text data and yields consistent performance gains across different scenarios.

In this paper we proposed a novel Adversarial Training (AT) approach for end-to-end speech recognition using a Criticizing Language Model (CLM). In this way, the CLM and the automatic speech recognition (ASR) model can challenge and learn from each other iteratively to improve performance. Since the CLM takes only text as input, huge quantities of unpaired text data can be utilized in this approach within end-to-end training. Moreover, AT can be applied to any end-to-end ASR model using any deep-learning-based language modeling framework, and is compatible with any existing end-to-end decoding method. Initial results with an example experimental setup demonstrated that the proposed approach is able to gain consistent improvements efficiently from auxiliary text data under different scenarios.
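The alternating "challenge and learn from each other" loop can be illustrated with a toy sketch. This is not the paper's implementation: the single-frame softmax "ASR model", the linear critic, the Wasserstein-style objective with weight clipping, and all names (`critic_step`, `asr_step`, `train`) are assumptions chosen only to make the idea concrete. The critic sees text-only token distributions, so the "real" side needs no paired audio.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score(w, dist):
    """Toy critic (hypothetical): linear score of a token distribution."""
    return sum(wi * di for wi, di in zip(w, dist))

def critic_step(w, real, fake, lr=0.5, clip=1.0):
    """Critic update: raise score(real) - score(fake).

    Weight clipping is an assumed WGAN-style constraint, not a detail
    taken from the abstract.
    """
    new_w = [wi + lr * (r - f) for wi, r, f in zip(w, real, fake)]
    return [max(-clip, min(clip, wi)) for wi in new_w]

def asr_step(theta, w, lr=0.5):
    """ASR update: move the logits so the critic scores the output higher."""
    p = softmax(theta)
    s = score(w, p)
    # For a softmax output, d score / d theta_k = p_k * (w_k - score).
    return [t + lr * pk * (wk - s) for t, pk, wk in zip(theta, p, w)]

def train(real, rounds=5):
    """Alternate critic and ASR updates, starting from a uniform output."""
    theta = [0.0] * len(real)   # toy ASR logits over the vocabulary
    w = [0.0] * len(real)       # toy critic weights
    for _ in range(rounds):
        w = critic_step(w, real, softmax(theta))
        theta = asr_step(theta, w)
    return theta, w

if __name__ == "__main__":
    # Token statistics of unpaired text (made-up numbers for illustration).
    unpaired_text = [0.8, 0.1, 0.1]
    theta, w = train(unpaired_text)
    print("ASR output distribution:", softmax(theta))
```

In this toy, each critic update widens the gap between the text-only statistics and the ASR output, and each ASR update shifts the output toward what the critic prefers. A real system would presumably also keep a supervised ASR loss on paired speech-text data; that term is left out here to keep the sketch short.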

