CL · May 31, 2017

Learning When to Attend for Neural Machine Translation

arXiv:1705.11160v1 · 6 citations
Originality Highly original
AI Analysis

This paper addresses a specific inefficiency in neural machine translation, though it is incremental in that it builds on existing attention models.

The paper tackles the problem that standard attention mechanisms in neural machine translation always attend to source words, even when a target word has no corresponding source word. It proposes an attention model that decides when to attend, yielding a 0.8 BLEU improvement over a state-of-the-art baseline on NIST Chinese-English translation tasks.

In the past few years, attention mechanisms have become an indispensable component of end-to-end neural machine translation models. However, previous attention models always refer to some source words when predicting a target word, which contradicts the fact that some target words have no corresponding source words. Motivated by this observation, we propose a novel attention model that can determine when a decoder should attend to source words and when it should not. Experimental results on NIST Chinese-English translation tasks show that the new model achieves an improvement of 0.8 BLEU over a state-of-the-art baseline.
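The abstract does not spell out how the "when to attend" decision is made, so the following is only a minimal sketch of one plausible reading: a scalar sigmoid gate, computed from the decoder state, that scales the attention context vector. The function name `gated_attention`, the parameters `w_gate` and `b_gate`, and the dot-product scoring are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention(decoder_state, source_states, w_gate, b_gate):
    """Hypothetical gated attention step (illustrative, not the paper's model).

    decoder_state: (hidden,) current decoder hidden state
    source_states: (src_len, hidden) encoder annotations
    w_gate, b_gate: assumed gate parameters, shape (hidden,) and scalar
    """
    # Standard attention: score each source position, normalise, average.
    scores = source_states @ decoder_state        # (src_len,)
    weights = softmax(scores)                     # sums to 1
    context = weights @ source_states             # (hidden,)

    # Assumed gate in (0, 1): near 0 means "do not attend to the source".
    gate = sigmoid(w_gate @ decoder_state + b_gate)
    return gate * context, gate, weights
```

With a strongly negative `b_gate` the gate approaches 0 and the source context is effectively suppressed, which matches the case where a target word has no corresponding source word; with a strongly positive `b_gate` the step reduces to ordinary attention.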

