CL · Aug 7, 2017

Memory-augmented Neural Machine Translation

arXiv:1708.02005v11119 citations
Originality: Incremental advance
AI Analysis

The paper addresses translation quality problems that affect users of NMT systems, particularly with rare words. The advance is incremental because it builds on existing NMT and statistical machine translation methods.

The paper tackles NMT's difficulty with infrequent words and word pairs by introducing a memory-augmented architecture that stores translation knowledge. It reports BLEU improvements of 9.0 and 2.7 points over the NMT baseline on two Chinese-English tasks, along with better out-of-vocabulary handling.

Neural machine translation (NMT) has achieved notable success in recent times; however, it is also widely recognized that this approach has limitations in handling infrequent words and word pairs. This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then uses that knowledge to assist the neural model. We use this memory mechanism to combine the knowledge learned from a conventional statistical machine translation system and the rules learned by an NMT system, and also propose a solution for out-of-vocabulary (OOV) words based on this framework. Our experiments on two Chinese-English translation tasks demonstrated that the M-NMT architecture outperformed the NMT baseline by $9.0$ and $2.7$ BLEU points on the two tasks, respectively. Additionally, we found that this architecture resulted in a much more effective OOV treatment than competitive methods.
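
To make the idea of a translation memory assisting the neural model more concrete, here is a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes a simple fixed-gate interpolation between the model's next-word distribution and a distribution built from memory entries. The names `memory_distribution`, `combine`, `gate`, and the toy vocabulary and memory contents are all hypothetical.

```python
def memory_distribution(source_words, memory, vocab):
    """Turn memory entries for the given source words into a target-word distribution.

    `memory` maps a source word to a list of (target_word, weight) pairs,
    e.g. as might be taken from a statistical MT lexicon (assumption).
    Returns None when no source word has a usable memory entry.
    """
    mass = {w: 0.0 for w in vocab}
    for src in source_words:
        for tgt, weight in memory.get(src, []):
            if tgt in mass:
                mass[tgt] += weight
    total = sum(mass.values())
    if total == 0.0:
        return None
    return {w: v / total for w, v in mass.items()}


def combine(nmt_probs, mem_probs, gate):
    """Interpolate the two distributions; `gate` in [0, 1] is the weight on the memory."""
    if mem_probs is None:
        return dict(nmt_probs)  # nothing in memory: fall back to the neural model
    return {
        w: (1.0 - gate) * p + gate * mem_probs.get(w, 0.0)
        for w, p in nmt_probs.items()
    }


# Toy example: the neural model prefers <unk> for a rare source word,
# while the memory holds a translation for it.
vocab = ["the", "cat", "<unk>", "aardvark"]
nmt_probs = {"the": 0.2, "cat": 0.15, "<unk>": 0.6, "aardvark": 0.05}
memory = {"rare_src_word": [("aardvark", 0.9), ("cat", 0.1)]}

mem_probs = memory_distribution(["rare_src_word"], memory, vocab)
combined = combine(nmt_probs, mem_probs, gate=0.5)
best = max(combined, key=combined.get)
print(best, combined)
```

In this toy setting the memory shifts probability mass from `<unk>` toward the stored translation. A real system would presumably learn how much to trust the memory at each decoding step rather than use a fixed `gate`.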

