CLASSep 25, 2019

Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training

arXiv:1909.11430v3650 citations
Originality Incremental advance
AI Analysis

This addresses the issue of error propagation in speech translation for users of such systems, representing an incremental improvement over existing methods.

The paper tackles the problem of speech recognition errors degrading machine translation performance in speech translation systems by proposing a training architecture that enhances robustness against such errors, achieving up to 2.83 BLEU improvement on noisy ASR output while maintaining performance on clean text.

In a pipeline speech translation system, automatic speech recognition (ASR) system will transmit errors in recognition to the downstream machine translation (MT) system. A standard machine translation system is usually trained on parallel corpus composed of clean text and will perform poorly on text with recognition noise, a gap well known in speech translation community. In this paper, we propose a training architecture which aims at making a neural machine translation model more robust against speech recognition errors. Our approach addresses the encoder and the decoder simultaneously using adversarial learning and data augmentation, respectively. Experimental results on IWSLT2018 speech translation task show that our approach can bridge the gap between the ASR output and the MT input, outperforms the baseline by up to 2.83 BLEU on noisy ASR output, while maintaining close performance on clean text.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes