CL, Nov 27, 2017

Modeling Past and Future for Neural Machine Translation

arXiv:1711.09502v21120 citations
Originality: Incremental advance
AI Analysis

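This paper addresses a specific bottleneck in neural machine translation (NMT): the decoder does not explicitly track which source content has already been translated. For researchers and practitioners, it offers an incremental improvement over existing methods such as coverage models.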

The paper tackles the problem that NMT systems do not explicitly model translated and untranslated content during decoding. It proposes a mechanism that splits the source information into translated Past and untranslated Future contents, each modeled by an additional recurrent layer. The approach significantly improves translation performance on Chinese-English, German-English and English-German tasks, and it outperforms the conventional coverage model in both translation quality and alignment error rate.

Existing neural machine translation systems do not explicitly model what has been translated and what has not during the decoding phase. To address this problem, we propose a novel mechanism that separates the source information into two parts: translated Past contents and untranslated Future contents, which are modeled by two additional recurrent layers. The Past and Future contents are fed to both the attention model and the decoder states, which offers NMT systems the knowledge of translated and untranslated contents. Experimental results show that the proposed approach significantly improves translation performance in Chinese-English, German-English and English-German translation tasks. Specifically, the proposed model outperforms the conventional coverage model in both translation quality and alignment error rate.
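The data flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes plain tanh recurrent cells, random untrained weights, a zero initial Past state, a Future state initialised to the mean of the source annotations, and a particular update order within each step. It also omits target word embeddings and output prediction. All names (`decode_steps`, `att_p`, `fut_w`, and so on) are hypothetical. The sketch only shows how two extra recurrent states could be updated from the attention context and fed back into both the attention scores and the decoder state.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def vec_add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def tanh_vec(x):
    return [math.tanh(xi) for xi in x]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_steps(annotations, n_steps, seed=0):
    """Run n_steps of a toy decoder with extra Past and Future states.

    annotations: list of source annotation vectors, all of dimension d.
    Weights are random (assumption: this is untrained, for illustration only).
    """
    d = len(annotations[0])
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(d)]

    names = ["att_h", "att_s", "att_p", "att_f",
             "past_w", "past_u", "fut_w", "fut_u",
             "dec_w", "dec_c", "dec_p", "dec_f"]
    W = {name: mat() for name in names}
    v = [rng.uniform(-0.5, 0.5) for _ in range(d)]

    dec = [0.0] * d    # decoder state
    past = [0.0] * d   # assumption: nothing has been translated yet
    # Assumption: everything is untranslated, so start from the source mean.
    future = [sum(col) / len(annotations) for col in zip(*annotations)]

    history = []
    for _ in range(n_steps):
        # Attention scores see the decoder state plus Past and Future.
        query = vec_add(matvec(W["att_s"], dec),
                        matvec(W["att_p"], past),
                        matvec(W["att_f"], future))
        scores = []
        for h in annotations:
            hidden = tanh_vec(vec_add(matvec(W["att_h"], h), query))
            scores.append(sum(vi * ti for vi, ti in zip(v, hidden)))
        alpha = softmax(scores)
        context = [sum(a * h[k] for a, h in zip(alpha, annotations))
                   for k in range(d)]

        # Two additional recurrent layers, both driven by the context.
        past = tanh_vec(vec_add(matvec(W["past_w"], past),
                                matvec(W["past_u"], context)))
        future = tanh_vec(vec_add(matvec(W["fut_w"], future),
                                  matvec(W["fut_u"], context)))

        # The decoder state update also receives Past and Future.
        dec = tanh_vec(vec_add(matvec(W["dec_w"], dec),
                               matvec(W["dec_c"], context),
                               matvec(W["dec_p"], past),
                               matvec(W["dec_f"], future)))
        history.append({"alpha": alpha, "past": past,
                        "future": future, "decoder": dec})
    return history
```

Calling `decode_steps(annotations, n_steps=3)` on a short list of equally sized annotation vectors returns one record per step, holding the attention weights and the three states. A real system would use trained gated cells and would condition each step on the previously generated target word.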

Code Implementations: 1 repo