CL · Oct 14, 2021

Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision

arXiv:2110.07515v1 · 65 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of slow inference in machine translation for applications requiring real-time processing, representing an incremental improvement over existing non-autoregressive methods.

The paper tackles the trade-off between inference efficiency and translation quality in neural machine translation by proposing DSLP, a non-autoregressive Transformer trained with deep supervision and fed additional layer-wise predictions. The best DSLP variant outperforms the autoregressive model on three of the four translation tasks while being 14.8 times more efficient in inference.

How do we perform efficient inference while retaining high translation quality? Existing neural machine translation models, such as Transformer, achieve high performance, but they decode words one by one, which is inefficient. Recent non-autoregressive translation models speed up the inference, but their quality is still inferior. In this work, we propose DSLP, a highly efficient and high-performance model for machine translation. The key insight is to train a non-autoregressive Transformer with Deep Supervision and feed additional Layer-wise Predictions. We conducted extensive experiments on four translation tasks (both directions of WMT'14 EN-DE and WMT'16 EN-RO). Results show that our approach consistently improves the BLEU scores compared with respective base models. Specifically, our best variant outperforms the autoregressive model on three translation tasks, while being 14.8 times more efficient in inference.
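The two ideas named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the decoder layer is replaced by a position-wise linear map with tanh, the per-layer loss is assumed to be plain cross-entropy, and the predicted token's embedding is simply added to the hidden state. All names (`dslp_forward`, `out_proj`, `embed`) are hypothetical and chosen for illustration only.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def dslp_forward(hidden, targets, layers, out_proj, embed):
    """Toy forward pass over all positions in parallel (non-autoregressive).

    hidden:  list of per-position vectors
    targets: list of gold token ids, one per position
    Returns the summed per-layer loss and each layer's predictions.
    """
    total_loss = 0.0
    per_layer_preds = []
    for W in layers:
        # Stand-in for a decoder layer (assumption: linear map + tanh).
        hidden = [[math.tanh(v) for v in matvec(W, h)] for h in hidden]
        # Every layer predicts tokens through a shared output projection.
        log_probs = [log_softmax(matvec(out_proj, h)) for h in hidden]
        # Deep supervision: each layer contributes its own loss term
        # (assumption: mean cross-entropy over positions).
        total_loss += -sum(lp[t] for lp, t in zip(log_probs, targets)) / len(targets)
        preds = [max(range(len(lp)), key=lp.__getitem__) for lp in log_probs]
        per_layer_preds.append(preds)
        # Layer-wise prediction: feed this layer's predicted tokens to the
        # next layer (assumption: add the embedding to the hidden state).
        hidden = [[a + b for a, b in zip(h, embed[p])] for h, p in zip(hidden, preds)]
    return total_loss, per_layer_preds

# Tiny random model: 3 layers, hidden size 4, vocabulary of 5, 6 positions.
rng = random.Random(0)
dim, vocab, length, num_layers = 4, 5, 6, 3
layers = [rand_matrix(rng, dim, dim) for _ in range(num_layers)]
out_proj = rand_matrix(rng, vocab, dim)
embed = rand_matrix(rng, vocab, dim)
hidden = rand_matrix(rng, length, dim)
targets = [rng.randrange(vocab) for _ in range(length)]

loss, preds = dslp_forward(hidden, targets, layers, out_proj, embed)
```

In this sketch every layer emits a full-length prediction in one pass, so the number of sequential steps depends on the number of layers rather than on the sentence length, which is where the inference speed-up of non-autoregressive decoding comes from.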

Code Implementations: 2 repos