CL, Jun 13, 2017

An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing

arXiv:1706.04138v21101 citations
Originality: Incremental advance
AI Analysis

This work aims to improve machine translation quality through automatic post-editing. The advance is incremental because it adapts existing neural methods to one specific task.

The paper tackled automatic post-editing of machine translation output by exploring neural sequence-to-sequence architectures, and found that dual-attention models incorporating all available data improved on the best WMT-2016 shared task system and other published results.

In this work, we explore multiple neural architectures adapted for the task of automatic post-editing of machine translation output. We focus on neural end-to-end models that combine both inputs $mt$ (raw MT output) and $src$ (source language input) in a single neural architecture, modeling $\{mt, src\} \rightarrow pe$ directly. Apart from that, we investigate the influence of hard-attention models which seem to be well-suited for monolingual tasks, as well as combinations of both ideas. We report results on data sets provided during the WMT-2016 shared task on automatic post-editing and can demonstrate that dual-attention models that incorporate all available data in the APE scenario in a single model improve on the best shared task system and on all other published results after the shared task. Dual-attention models that are combined with hard attention remain competitive despite applying fewer changes to the input.
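As a rough illustration of the dual-attention idea described above, the sketch below lets one decoder state attend separately over the encoded $mt$ and $src$ sequences and then combines the two context vectors. It is a minimal sketch under stated assumptions, not the paper's implementation: it uses plain dot-product attention in place of whatever scoring function the authors used, it assumes the two contexts are combined by concatenation, and the names `attend` and `dual_attention_context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Dot-product attention (an assumption made for this sketch).

    Returns the weighted average of `states` and the attention weights.
    """
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, states))
        for i in range(dim)
    ]
    return context, weights


def dual_attention_context(decoder_state, mt_states, src_states):
    """Attend over both inputs and concatenate the two context vectors.

    Concatenation is assumed here; other ways of combining are possible.
    """
    mt_context, mt_weights = attend(decoder_state, mt_states)
    src_context, src_weights = attend(decoder_state, src_states)
    return mt_context + src_context, mt_weights, src_weights


if __name__ == "__main__":
    # Toy encoder states: two mt positions, three src positions, dimension 2.
    context, mt_w, src_w = dual_attention_context(
        [1.0, 0.0],
        [[1.0, 0.0], [0.0, 1.0]],
        [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],
    )
    print(len(context), mt_w, src_w)
```

In a full model, the combined context would feed the decoder's next-step prediction of the post-edited output $pe$. The sketch covers only the attention step.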

