CL · LG · May 2, 2020

Improving Non-autoregressive Neural Machine Translation with Monolingual Data

arXiv:2005.00932v31012 citations
AI Analysis

This work improves non-autoregressive translation models, which decode all target tokens in parallel, by augmenting their training with monolingual data.

The authors improve non-autoregressive neural machine translation by using monolingual data to transfer the generalization ability of an autoregressive teacher model. On the WMT14 En-De and WMT16 En-Ro tasks, the resulting models approach the teacher's performance and overfit less during training.

Non-autoregressive (NAR) neural machine translation is usually done via knowledge distillation from an autoregressive (AR) model. Under this framework, we leverage large monolingual corpora to improve the NAR model's performance, with the goal of transferring the AR model's generalization ability while preventing overfitting. On top of a strong NAR baseline, our experimental results on the WMT14 En-De and WMT16 En-Ro news translation tasks confirm that monolingual data augmentation consistently improves the performance of the NAR model to approach the teacher AR model's performance, yields comparable or better results than the best non-iterative NAR methods in the literature and helps reduce overfitting in the training process.
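The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that the AR teacher translates both the parallel corpus's source side and the extra monolingual source sentences, and that the NAR student trains on the union of the two. The names `build_nar_training_set` and `toy_teacher` are hypothetical, and the teacher here is a trivial stand-in rather than a real translation model.

```python
from typing import Callable, Iterable, List, Tuple

# A training example for the NAR student: (source sentence, teacher output).
Pair = Tuple[str, str]


def build_nar_training_set(
    parallel_sources: Iterable[str],
    monolingual_sources: Iterable[str],
    teacher_translate: Callable[[str], str],
) -> List[Pair]:
    """Pair every source sentence with the AR teacher's translation.

    Assumption: the distilled parallel data and the teacher-translated
    monolingual data are simply concatenated.
    """
    # Sequence-level knowledge distillation: teacher outputs stand in
    # for the human reference translations of the parallel corpus.
    distilled = [(src, teacher_translate(src)) for src in parallel_sources]
    # Monolingual augmentation: extra source-only sentences, which have
    # no reference at all, get teacher outputs as their targets.
    augmented = [(src, teacher_translate(src)) for src in monolingual_sources]
    return distilled + augmented


def toy_teacher(sentence: str) -> str:
    """Hypothetical stand-in for an AR model: upper-cases the input."""
    return sentence.upper()


data = build_nar_training_set(
    parallel_sources=["guten morgen"],
    monolingual_sources=["danke", "bitte"],
    teacher_translate=toy_teacher,
)
```

In this sketch the student never sees human references, so the size of its training set is limited only by how much source-side text the teacher can translate.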

