CL · Dec 31, 2020

Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade

arXiv:2012.15833v1 · 744 citations
AI Analysis

This work aims to improve the translation quality of fully non-autoregressive neural machine translation models for users who need high-speed inference without a significant loss in quality. It represents an incremental improvement in the field.

This paper addresses the quality degradation of fully non-autoregressive neural machine translation (NAT) while preserving its low inference latency. By combining and modifying existing techniques, the proposed system achieves 27.49 BLEU points on WMT14 En-De, comparable to autoregressive and iterative NAT systems, with an approximately 16.5X speedup at inference time.

Fully non-autoregressive neural machine translation (NAT) predicts tokens simultaneously with a single forward pass of the neural network, which significantly reduces inference latency at the expense of a quality drop compared to the Transformer baseline. In this work, we aim to close the performance gap while maintaining the latency advantage. We first inspect the fundamental issues of fully NAT models and adopt dependency reduction in the learning space of output tokens as the basic guidance. Then, we revisit methods in four different aspects that have been proven effective for improving NAT models, and carefully combine these techniques with the necessary modifications. Our extensive experiments on three translation benchmarks show that the proposed system achieves new state-of-the-art results for fully NAT models and obtains performance comparable to autoregressive and iterative NAT systems. For instance, one of the proposed models achieves 27.49 BLEU points on WMT14 En-De with an approximately 16.5X speedup at inference time.
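The latency argument in the abstract can be illustrated with a minimal sketch. It assumes a toy word-for-word lookup in place of a real Transformer; `ToyModel`, `forward_all`, `forward_next`, and the lexicon are hypothetical names for illustration only and do not come from the paper or its code. The sketch only counts forward passes: one for fully non-autoregressive decoding, versus one per generated token (plus an end-of-sequence step) for autoregressive decoding. It says nothing about the quality gap, which this toy cannot reproduce.

```python
from typing import Dict, List


class ToyModel:
    """Hypothetical stand-in for a translation model; it only counts forward passes."""

    def __init__(self, lexicon: Dict[str, str]) -> None:
        self.lexicon = lexicon
        self.forward_calls = 0

    def forward_all(self, src: List[str]) -> List[str]:
        # Fully non-autoregressive: every target position is predicted in one pass.
        self.forward_calls += 1
        return [self.lexicon.get(tok, tok) for tok in src]

    def forward_next(self, src: List[str], prefix: List[str]) -> str:
        # Autoregressive: one pass yields only the next token, given the prefix so far.
        self.forward_calls += 1
        i = len(prefix)
        return self.lexicon.get(src[i], src[i]) if i < len(src) else "<eos>"


def decode_nat(model: ToyModel, src: List[str]) -> List[str]:
    return model.forward_all(src)


def decode_ar(model: ToyModel, src: List[str], max_len: int = 50) -> List[str]:
    out: List[str] = []
    while len(out) < max_len:
        tok = model.forward_next(src, out)
        if tok == "<eos>":
            break
        out.append(tok)
    return out


# Toy data, assumed purely for illustration.
LEXICON = {"hallo": "hello", "welt": "world"}
SRC = ["hallo", "welt"]

nat_model = ToyModel(LEXICON)
ar_model = ToyModel(LEXICON)
nat_out = decode_nat(nat_model, SRC)
ar_out = decode_ar(ar_model, SRC)
```

In this toy both decoders return the same tokens, but `ar_model.forward_calls` grows with the output length while `nat_model.forward_calls` stays at one. In a real system the measured speedup also depends on hardware and batch size, so the forward-pass count is only a rough proxy for latency.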

Code Implementations: 1 repo