CL, May 16, 2016

Log-linear Combinations of Monolingual and Bilingual Neural Machine Translation Models for Automatic Post-Editing

Marcin Junczys-Dowmunt, Roman Grundkiewicz

arXiv:1605.04800v2 17.9110 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of improving machine translation output through post-editing for translation practitioners, representing an incremental advance by combining existing methods with artificial data generation.

The paper tackles the Automatic Post-Editing (APE) problem by applying neural translation models in a log-linear combination with a string-matching penalty, achieving improvements of -3.2% TER and +5.5% BLEU over the baseline and outperforming other systems in the WMT 2016 shared task.

This paper describes the submission of the AMU (Adam Mickiewicz University) team to the Automatic Post-Editing (APE) task of WMT 2016. We explore the application of neural translation models to the APE problem and achieve good results by treating different models as components in a log-linear model, allowing for multiple inputs (the MT-output and the source) that are decoded to the same target language (post-edited translations). A simple string-matching penalty integrated within the log-linear model is used to control for higher faithfulness with regard to the raw machine translation output. To overcome the problem of too little training data, we generate large amounts of artificial data. Our submission improves over the uncorrected baseline on the unseen test set by -3.2% TER and +5.5% BLEU and outperforms any other system submitted to the shared task by a large margin.
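As a rough illustration of the log-linear combination described in the abstract, the sketch below scores a candidate post-edit as a weighted sum of component log-probabilities plus a weighted string-matching feature. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the weights, the example log-probabilities, and in particular the penalty definition (one negative unit per candidate token that does not occur in the raw MT output) are all hypothetical, since the abstract does not specify them.

```python
import math
from typing import List, Sequence


def log_linear_score(log_probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of component log-probabilities (one per model)."""
    if len(log_probs) != len(weights):
        raise ValueError("need exactly one weight per component model")
    return sum(w * lp for w, lp in zip(weights, log_probs))


def mismatch_penalty(candidate: List[str], mt_output: List[str]) -> float:
    """Assumed penalty: -1 for each candidate token absent from the MT output."""
    mt_vocab = set(mt_output)
    return -float(sum(1 for tok in candidate if tok not in mt_vocab))


def score_candidate(
    candidate: List[str],
    mt_output: List[str],
    model_log_probs: Sequence[float],
    model_weights: Sequence[float],
    penalty_weight: float,
) -> float:
    """Combine model scores and the penalty feature in one log-linear score."""
    model_part = log_linear_score(model_log_probs, model_weights)
    return model_part + penalty_weight * mismatch_penalty(candidate, mt_output)


# Illustrative data only: two hypothetical component models (for example one
# reading the MT output and one reading the source) score two candidates.
mt_output = "the cat sat on mat".split()
faithful = "the cat sat on the mat".split()
rewritten = "a feline rested upon the rug".split()

faithful_score = score_candidate(faithful, mt_output, [-2.0, -2.5], [0.5, 0.5], 1.0)
rewritten_score = score_candidate(rewritten, mt_output, [-1.8, -2.4], [0.5, 0.5], 1.0)
best = faithful if faithful_score > rewritten_score else rewritten
```

In this toy setup the rewritten candidate has slightly better model scores, but the penalty feature pulls the combined score toward the candidate that stays closer to the raw MT output, which is the faithfulness effect the abstract attributes to the string-matching penalty.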
