CL · Jun 25, 2019

Saliency-driven Word Alignment Interpretation for Neural Machine Translation

arXiv:1906.10282v21113 citations
AI Analysis

The paper addresses the interpretability gap in NMT for researchers and practitioners, though the contribution is incremental because it builds on existing interpretation methods.

The paper tackles the perception that Neural Machine Translation models do not learn interpretable word alignments. It shows that they do learn such alignments once proper interpretation methods are applied: under force decoding, the induced alignments are of better quality than fast-align for some systems, and under free decoding they agree well with the alignments produced by automatic alignment tools.

Despite their original goal to jointly learn to align and translate, Neural Machine Translation (NMT) models, especially Transformer, are often perceived as not learning interpretable word alignments. In this paper, we show that NMT models do learn interpretable word alignments, which could only be revealed with proper interpretation methods. We propose a series of such methods that are model-agnostic, are able to be applied either offline or online, and do not require parameter update or architectural change. We show that under the force decoding setup, the alignments induced by our interpretation method are of better quality than fast-align for some systems, and when performing free decoding, they agree well with the alignments induced by automatic alignment tools.
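
To make the idea of reading alignments off a trained model concrete, here is a minimal sketch of saliency-based alignment. It is not the paper's implementation: `log_prob` is a hypothetical toy stand-in for an NMT model, the names `SRC` and `OUT` are illustrative, and central finite differences replace the backpropagated gradients a real system would use. The sketch scores each source word by gradient-times-input with respect to the log-probability of a target word, then aligns the target word to the highest-scoring source word. No parameters are updated and the model is treated as a black box.

```python
import math

def log_prob(src_embs, out_vecs, target):
    """Toy stand-in for an NMT model (an assumption, not a real system):
    returns log p(target word | source embeddings)."""
    logits = [sum(math.tanh(sum(a * b for a, b in zip(x, o))) for x in src_embs)
              for o in out_vecs]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[target] - log_z

def saliency(src_embs, out_vecs, target, eps=1e-5):
    """Gradient-times-input saliency of every source word for one target word.
    The gradient is estimated with central differences, so the model is only
    ever called, never modified."""
    scores = []
    for i, x in enumerate(src_embs):
        total = 0.0
        for j in range(len(x)):
            plus = [list(v) for v in src_embs]
            minus = [list(v) for v in src_embs]
            plus[i][j] += eps
            minus[i][j] -= eps
            grad = (log_prob(plus, out_vecs, target)
                    - log_prob(minus, out_vecs, target)) / (2 * eps)
            total += grad * x[j]
        scores.append(total)
    return scores

def align(src_embs, out_vecs, targets):
    """Align each target word to the source position with the highest saliency."""
    alignment = []
    for t in targets:
        s = saliency(src_embs, out_vecs, t)
        alignment.append(max(range(len(s)), key=s.__getitem__))
    return alignment

# Three source words with one-hot embeddings; target word k is tied to the
# source word whose embedding equals OUT[k] (a reordered "translation").
SRC = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
OUT = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

print(align(SRC, OUT, [0, 1, 2]))  # [2, 0, 1]
```

Passing the reference translation as `targets` corresponds loosely to the force decoding setup, while passing the model's own output corresponds to free decoding. In a real NMT system the gradients would come from automatic differentiation rather than finite differences.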

Code Implementations: 1 repo
