CL AI Sep 13, 2021

Attention Weights in Transformer NMT Fail Aligning Words Between Sequences but Largely Explain Model Predictions

arXiv:2109.05853v1663 citations
Originality: Incremental advance
AI Analysis

This work addresses interpretability in neural machine translation and is relevant to both researchers and practitioners, but it is an incremental advance because it builds on existing attention analysis.

The paper examines why encoder-decoder attention weights in Transformer NMT fail to align words between the source and target sequences. It shows that the weights rely mainly on uninformative source tokens yet still largely explain model predictions, and it proposes methods that reduce word alignment error rate compared to standard alignments induced from attention weights.

This work proposes an extensive analysis of the Transformer architecture in the Neural Machine Translation (NMT) setting. Focusing on the encoder-decoder attention mechanism, we prove that attention weights systematically make alignment errors by relying mainly on uninformative tokens from the source sequence. However, we observe that NMT models assign attention to these tokens to regulate the contribution in the prediction of the two contexts, the source and the prefix of the target sequence. We provide evidence about the influence of wrong alignments on the model behavior, demonstrating that the encoder-decoder attention mechanism is well suited as an interpretability method for NMT. Finally, based on our analysis, we propose methods that largely reduce the word alignment error rate compared to standard induced alignments from attention weights.
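
To make "standard induced alignments from attention weights" and "word alignment error rate" concrete, here is a minimal sketch. It assumes the common baseline of linking each target token to the source position with the highest attention weight, and the usual alignment error rate (AER) defined over sure and possible gold links. The attention matrix, the gold links, and the function names are hypothetical toy values chosen for illustration. They are not taken from the paper, and this is not the paper's proposed method.

```python
def induce_alignment(attention):
    """Link each target token to its highest-attention source position.

    attention[t][s] is the weight that target position t places on
    source position s (toy values; a real model would average or pick
    heads and layers first, which this sketch does not cover).
    Returns a set of (source_index, target_index) links.
    """
    links = set()
    for t, row in enumerate(attention):
        s = max(range(len(row)), key=row.__getitem__)  # argmax over source
        links.add((s, t))
    return links


def alignment_error_rate(hypothesis, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), with S a subset of P."""
    possible = possible | sure  # sure links always count as possible
    if not hypothesis and not sure:
        return 0.0
    matched = len(hypothesis & sure) + len(hypothesis & possible)
    return 1.0 - matched / (len(hypothesis) + len(sure))


# Toy example: 3 target tokens attending over 4 source positions, where
# the last source position stands in for an uninformative token such as
# end-of-sentence (an assumption made for this illustration).
attention = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.1, 0.6],  # most mass falls on the uninformative token
    [0.1, 0.1, 0.6, 0.2],
]
gold_sure = {(0, 0), (1, 1), (2, 2)}
gold_possible = set()

induced = induce_alignment(attention)
aer = alignment_error_rate(induced, gold_sure, gold_possible)
print(sorted(induced), round(aer, 3))
```

In this toy case the second target token is linked to the final source position instead of its gold counterpart, which is the kind of error the abstract attributes to attention falling on uninformative source tokens.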

