CL · Sep 3, 2019

Context-Aware Monolingual Repair for Neural Machine Translation

arXiv:1909.01383v2 · 1032 citations
Originality: Incremental advance
AI Analysis

The paper addresses translation consistency for users of NMT systems, though the advance is incremental because it builds on existing sentence-level methods.

The paper tackles inconsistencies that appear when sentence-level neural machine translation output is read in context. It proposes DocRepair, a monolingual model that corrects these inconsistencies by post-editing groups of sentence-level translations, and reports large improvements on contextual phenomena as well as gains in BLEU.

Modern sentence-level NMT systems often produce plausible translations of isolated sentences. However, when put in context, these translations may end up being inconsistent with each other. We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. DocRepair performs automatic post-editing on a sequence of sentence-level translations, refining translations of sentences in the context of each other. For training, the DocRepair model requires only monolingual document-level data in the target language. It is trained as a monolingual sequence-to-sequence model that maps inconsistent groups of sentences into consistent ones. The consistent groups come from the original training data; the inconsistent groups are obtained by sampling round-trip translations for each isolated sentence. We show that this approach successfully imitates the inconsistencies we aim to fix: using contrastive evaluation, we show large improvements in the translation of several contextual phenomena in an English-Russian translation task, as well as improvements in the BLEU score. We also conduct a human evaluation and show a strong annotator preference for the corrected translations over the baseline ones. Moreover, we analyze which discourse phenomena are hard to capture using monolingual data only.
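The data construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `make_training_pairs`, the separator string `SEP`, the group size, and the two translator callables are all assumptions introduced here, and the toy "translators" in the demo only stand in for real sentence-level NMT systems.

```python
from typing import Callable, List, Tuple

# Hypothetical separator placed between sentences of a group (an assumption).
SEP = " <sep> "


def make_training_pairs(
    document: List[str],
    to_source: Callable[[str], str],   # target -> source sentence translator (assumed)
    to_target: Callable[[str], str],   # source -> target sentence translator (assumed)
    group_size: int = 4,               # illustrative value, not taken from the text above
) -> List[Tuple[str, str]]:
    """Build (inconsistent, consistent) pairs from one target-language document.

    The consistent side is the original group of consecutive sentences.
    The inconsistent side round-trips every sentence in isolation, so each
    translation is produced without seeing its neighbours.
    """
    pairs = []
    # Walk over non-overlapping groups; a trailing partial group is dropped.
    for start in range(0, len(document) - group_size + 1, group_size):
        group = document[start:start + group_size]
        # The callables may sample, which would give varied round-trip outputs.
        round_trip = [to_target(to_source(sentence)) for sentence in group]
        pairs.append((SEP.join(round_trip), SEP.join(group)))
    return pairs


# Toy stand-ins for translators: the round trip loses capitalisation.
demo_document = ["She came.", "He left.", "They stayed.", "We waited.", "It rained."]
demo_pairs = make_training_pairs(
    demo_document,
    to_source=lambda s: s.upper(),
    to_target=lambda s: s.lower(),
    group_size=2,
)
for inconsistent, consistent in demo_pairs:
    print(inconsistent, "=>", consistent)
```

A sequence-to-sequence model trained on such pairs would take the first element as input and the second as output. Because only the target-language document and sentence-level translators are needed, no parallel document-level data enters this step.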

Code Implementations: 1 repo