CL · LG · Apr 16, 2020

Non-Autoregressive Machine Translation with Latent Alignments

arXiv:2004.07437v3 · 1057 citations
Originality: Highly original
AI Analysis

This work addresses the speed-accuracy trade-off in machine translation, offering simpler non-autoregressive models that require neither target length prediction nor re-scoring with an autoregressive model.

The paper tackles non-autoregressive machine translation with two latent-alignment models, CTC and Imputer. On WMT'14 En→De, CTC reaches 25.7 BLEU in a single generation step, state of the art for single-step non-autoregressive translation, and Imputer reaches 28.0 BLEU in 4 generation steps, matching the autoregressive Transformer baseline at 27.8 BLEU.

This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. We revisit CTC for machine translation and demonstrate that a simple CTC model can achieve state-of-the-art for single-step non-autoregressive machine translation, contrary to what prior work indicates. In addition, we adapt the Imputer model for non-autoregressive machine translation and demonstrate that Imputer with just 4 generation steps can match the performance of an autoregressive Transformer baseline. Our latent alignment models are simpler than many existing non-autoregressive translation baselines; for example, we do not require target length prediction or re-scoring with an autoregressive model. On the competitive WMT'14 En→De task, our CTC model achieves 25.7 BLEU with a single generation step, while Imputer achieves 27.5 BLEU with 2 generation steps, and 28.0 BLEU with 4 generation steps. This compares favourably to the autoregressive Transformer baseline at 27.8 BLEU.
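To make "latent alignments with dynamic programming" concrete, here is a minimal sketch of the standard CTC collapse rule and forward recursion in plain Python. It is not the authors' implementation: the function names, the blank id of 0, and the list-of-lists input format are assumptions chosen for illustration.

```python
import math
from itertools import groupby

BLANK = 0  # assumed id of the CTC blank token (illustrative choice)
NEG_INF = float("-inf")


def ctc_collapse(alignment, blank=BLANK):
    """Map an alignment to its output: merge consecutive repeats, then drop blanks."""
    return [tok for tok, _ in groupby(alignment) if tok != blank]


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def ctc_log_likelihood(log_probs, target, blank=BLANK):
    """Log-probability of `target`, summed over every alignment that collapses to it.

    log_probs: T rows (one per alignment position), each a list of V
               log-probabilities over the vocabulary including the blank.
    """
    # Interleave blanks around the target tokens: blank, y1, blank, y2, ..., blank
    ext = [blank]
    for tok in target:
        ext += [tok, blank]
    S = len(ext)

    # alpha[s] = log-probability of all prefixes ending in state s at the current position
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new_alpha = [NEG_INF] * S
        for s in range(S):
            candidates = [alpha[s]]  # stay in the same state
            if s >= 1:
                candidates.append(alpha[s - 1])  # advance by one state
            # Skip over a blank only between two different non-blank tokens
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                candidates.append(alpha[s - 2])
            new_alpha[s] = _logsumexp(candidates) + log_probs[t][ext[s]]
        alpha = new_alpha

    # Valid alignments end in the last token or the trailing blank
    return _logsumexp(alpha[-2:]) if S > 1 else alpha[0]
```

As a sanity check, with uniform probabilities over three tokens and two alignment positions, exactly three alignments collapse to the target [1], namely (1, 1), (1, blank) and (blank, 1), so the marginal probability is 3/9.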

Code Implementations: 2 repos
