CL · Sep 1, 2018

Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter

arXiv:1809.00120v21113 citations
Originality: Incremental advance
AI Analysis

This work offers insights for improving translation models by highlighting language-specific effects, but it is incremental, since it refines the existing understanding of the accuracy drop.

The paper investigates the accuracy drop problem in neural machine translation. It finds that error propagation contributes, but language characteristics such as branching direction play a larger role: right-branching languages show higher accuracy on the left part of the translation, and left-branching languages on the right part.

Neural machine translation usually adopts autoregressive models and suffers from exposure bias as well as the consequent error propagation problem. Many previous works have discussed the relationship between error propagation and the accuracy drop problem (i.e., the left part of the translated sentence is often better than its right part in left-to-right decoding models). In this paper, we conduct a series of analyses to understand this problem in depth and report several findings. (1) The role of error propagation in the accuracy drop is overstated in the literature, although it does contribute to the problem. (2) Characteristics of a language play a more important role in causing the accuracy drop: the left part of the translation result in a right-branching language (e.g., English) is more likely to be more accurate than its right part, while the right part is more accurate for a left-branching language (e.g., Japanese). Our findings are confirmed on different model structures, including Transformer and RNN, and in other sequence generation tasks such as text summarization.
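To make the notion of accuracy drop concrete, the following is a minimal sketch of how one might compare the quality of the left and right halves of a translation. It is an illustration under stated assumptions, not the paper's evaluation protocol: the function names (`unigram_precision`, `split_halves`, `left_right_precision`) are hypothetical, tokens are split on whitespace, each sentence is cut at its midpoint, and clipped unigram precision stands in for whatever metric the authors actually use.

```python
from collections import Counter


def unigram_precision(hyp_tokens, ref_tokens):
    """Clipped unigram precision of hyp_tokens against ref_tokens."""
    if not hyp_tokens:
        return 0.0
    ref_counts = Counter(ref_tokens)
    hyp_counts = Counter(hyp_tokens)
    # Each hypothesis token is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[tok]) for tok, count in hyp_counts.items())
    return matched / len(hyp_tokens)


def split_halves(tokens):
    """Split a token list at its midpoint (assumption: a simple 50/50 cut)."""
    mid = len(tokens) // 2
    return tokens[:mid], tokens[mid:]


def left_right_precision(hypothesis, reference):
    """Return (left_precision, right_precision) for one sentence pair."""
    hyp_left, hyp_right = split_halves(hypothesis.split())
    ref_left, ref_right = split_halves(reference.split())
    return (
        unigram_precision(hyp_left, ref_left),
        unigram_precision(hyp_right, ref_right),
    )


# Toy example (made-up sentences, not data from the paper).
left, right = left_right_precision(
    "the cat sat on a rug",
    "the cat sat on the mat",
)
print(f"left={left:.3f} right={right:.3f}")
```

Averaged over a test set, a consistently higher left score than right score would correspond to the accuracy drop described in the abstract for left-to-right decoding; the abstract's second finding suggests the gap should be read alongside the branching direction of the target language.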
