CL, Nov 8, 2019

Pretrained Language Models for Document-Level Neural Machine Translation

arXiv:1911.03110v1, 21 citations
Originality: Incremental advance
AI Analysis

This work addresses a performance bottleneck in document-level NMT, offering incremental improvements over existing methods.

The paper tackles the performance degradation that document-level neural machine translation suffers when it is given large contexts. It combines pretrained language models with context manipulation methods and reports significantly better results than three previously reported document-level systems on IWSLT data sets across three language pairs.

Previous work on document-level NMT usually focuses on limited contexts because of degraded performance on larger contexts. In this paper, we investigate using large contexts, with three main contributions: (1) Unlike previous work, which pretrained models on large-scale sentence-level parallel corpora, we use pretrained language models, specifically BERT, which are trained on monolingual documents; (2) We propose context manipulation methods to control the influence of large contexts, which lead to comparable results for systems using small and large contexts; (3) We introduce multi-task training as regularization to keep models from overfitting our training corpora, which, together with a deeper encoder, further improves our systems. Experiments are conducted on the widely used IWSLT data sets with three language pairs, i.e., Chinese--English, French--English and Spanish--English. Results show that our systems are significantly better than three previously reported document-level systems.

