CL · Aug 19, 2021

MvSR-NAT: Multi-view Subset Regularization for Non-Autoregressive Machine Translation

arXiv:2108.08447v1 · 13 citations
Originality: Incremental advance
AI Analysis

This work targets the accuracy gap between fast non-autoregressive translation models and autoregressive Transformer baselines. It is an incremental improvement over existing non-autoregressive methods, aimed at NLP researchers and practitioners.

The paper introduces Multi-view Subset Regularization (MvSR), a regularization method that encourages consistent predictions in non-autoregressive machine translation (NAT). It reports BLEU gains of 0.36-1.14 over previous NAT models and narrows the gap to the Transformer baseline to 0.01-0.44 BLEU on the small datasets (WMT16 RO↔EN and IWSLT DE→EN).

Conditional masked language models (CMLM) have shown impressive progress in non-autoregressive machine translation (NAT). They learn the conditional translation model by predicting a randomly masked subset of the target sentence. Based on the CMLM framework, we introduce Multi-view Subset Regularization (MvSR), a novel regularization method to improve the performance of the NAT model. Specifically, MvSR consists of two parts: (1) shared mask consistency: we forward the same target with different mask strategies, and encourage the predictions at shared mask positions to be consistent with each other; (2) model consistency: we maintain an exponential moving average of the model weights, and enforce the predictions to be consistent between the average model and the online model. Without changing the CMLM-based architecture, our approach achieves remarkable performance on three public benchmarks, with 0.36-1.14 BLEU gains over previous NAT models. Moreover, compared with the stronger Transformer baseline, we reduce the gap to 0.01-0.44 BLEU on small datasets (WMT16 RO↔EN and IWSLT DE→EN).
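The two consistency terms described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`sample_mask`, `shared_positions`, `consistency_loss`, `ema_update`) are hypothetical, the symmetric KL divergence is an assumed choice of consistency measure (the abstract does not name one), and the decay value is an illustrative default. Predictions are represented as a mapping from target position to a probability distribution over a toy vocabulary.

```python
import math
import random


def sample_mask(length, num_masked, rng):
    """Pick `num_masked` distinct target positions to mask (one 'view')."""
    return sorted(rng.sample(range(length), num_masked))


def shared_positions(mask_a, mask_b):
    """Positions masked in both views; only these enter the consistency term."""
    return sorted(set(mask_a) & set(mask_b))


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Assumed consistency measure: average of KL in both directions."""
    return 0.5 * (kl(p, q) + kl(q, p))


def consistency_loss(probs_a, probs_b, positions):
    """Mean symmetric KL over `positions`; probs_* map position -> distribution."""
    if not positions:
        return 0.0
    total = sum(symmetric_kl(probs_a[i], probs_b[i]) for i in positions)
    return total / len(positions)


def ema_update(avg_weights, online_weights, decay=0.999):
    """Exponential moving average: avg <- decay * avg + (1 - decay) * online."""
    return [decay * a + (1.0 - decay) * w
            for a, w in zip(avg_weights, online_weights)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Two different random masks over the same 8-token target.
    mask_a = sample_mask(8, 4, rng)
    mask_b = sample_mask(8, 4, rng)
    shared = shared_positions(mask_a, mask_b)

    # Toy predictions over a 2-word vocabulary for every masked position.
    probs_a = {i: [0.9, 0.1] for i in mask_a}
    probs_b = {i: [0.6, 0.4] for i in mask_b}
    print("shared positions:", shared)
    print("shared mask consistency:", consistency_loss(probs_a, probs_b, shared))

    # Model consistency would compare online and averaged-model predictions
    # with the same loss; here only the weight averaging step is shown.
    print("ema weights:", ema_update([1.0, 0.0], [0.0, 1.0], decay=0.9))
```

In a real training loop the distributions would come from forward passes of the CMLM, and both consistency terms would be added to the usual masked-token prediction loss with some weighting; those details are left out here because the abstract does not specify them.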

