CL · Oct 9, 2017

What does Attention in Neural Machine Translation Pay Attention to?

arXiv:1710.03348v11144 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a fundamental question for machine translation researchers and offers insight into model interpretability, though it is incremental in that it builds on existing attention models.

The paper investigates whether attention mechanisms in neural machine translation function similarly to traditional word alignment, finding that attention captures additional useful information beyond simple alignment.

Attention in neural machine translation makes it possible to encode relevant parts of the source sentence at each translation step. As a result, attention is also considered to be an alignment model. However, no work has specifically studied attention and analysed what attention models learn. Thus, the question remains of how attention is similar to or different from traditional alignment. In this paper, we provide a detailed analysis of attention and compare it to traditional alignment. We answer the question of whether attention is only capable of modelling translational equivalence or whether it captures more information. We show that attention differs from alignment in some cases and captures useful information other than alignments.
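To make the comparison between attention and alignment concrete, the sketch below reads a hard alignment off an attention matrix and scores it against a reference alignment. It is only an illustration under stated assumptions, not the paper's evaluation procedure: the function names (`hard_alignment`, `agreement`, `entropy`), the toy attention weights, and the reference alignment are all hypothetical. Rows are target steps and columns are source positions.

```python
import math

def hard_alignment(attention):
    """For each target step, return the index of the most-attended source position."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]

def agreement(attention, reference):
    """Fraction of target steps whose most-attended source position
    appears in the reference alignment for that step."""
    predicted = hard_alignment(attention)
    hits = sum(1 for t, s in enumerate(predicted) if s in reference[t])
    return hits / len(predicted)

def entropy(row):
    """Shannon entropy (in nats) of one attention distribution.
    Higher values mean the attention is spread over more source words."""
    return -sum(p * math.log(p) for p in row if p > 0)

# Toy data (hypothetical): 3 target words attending over 3 source words.
attention = [
    [0.8, 0.1, 0.1],   # sharply focused on source word 0
    [0.1, 0.7, 0.2],   # focused on source word 1
    [0.3, 0.3, 0.4],   # spread out, weakly prefers source word 2
]
# Hypothetical reference alignment: set of aligned source indices per target word.
reference = [{0}, {1}, {1}]

if __name__ == "__main__":
    print("hard alignment:", hard_alignment(attention))
    print("agreement with reference:", agreement(attention, reference))
    print("entropy per target step:", [round(entropy(r), 3) for r in attention])
```

In this toy case the third target word attends broadly and its strongest weight falls outside the reference alignment, which is the kind of divergence between attention and alignment the abstract describes.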
