CL
Oct 18, 2020

Rethinking Document-level Neural Machine Translation

arXiv:2010.08961v2
650 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of improving translation quality for long documents, but it is incremental as it revisits and optimizes an existing model rather than introducing a new one.

The paper investigates whether existing Transformer models have sufficient capacity for document-level machine translation, finding that with proper training techniques, the original Transformer can achieve strong results on documents up to 2000 words, outperforming sentence-level models and previous methods across multiple metrics.

This paper does not aim at introducing a novel model for document-level neural machine translation. Instead, we head back to the original Transformer model and hope to answer the following question: Is the capacity of current models strong enough for document-level translation? Interestingly, we observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even with a length of 2000 words. We evaluate this model and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that document-level Transformer models outperform sentence-level ones and many previous methods in a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed assistant linguistic indicators, and human evaluation.

Code Implementations: 1 repo