CL LG · Apr 28, 2020

Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models

Shilin He, Xing Wang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

arXiv:2004.13270v11.37 citations

Originality: Incremental advance

AI Analysis

This work addresses the interpretability of NMT models, an open problem for researchers and practitioners in machine translation. The advance is incremental because it builds on existing statistical methods.

The paper examines how neural machine translation (NMT) models learn bilingual knowledge by extracting an interpretable phrase table from the training examples that a model predicts correctly. It finds that models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples. Experiments show that the extracted phrase table is consistent across language pairs and random seeds.

Machine translation (MT) systems translate text between different languages by automatically learning in-depth knowledge of bilingual lexicons, grammar and semantics from the training examples. Although neural machine translation (NMT) has led the field of MT, we have a poor understanding of how and why it works. In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table, an interpretable table of bilingual lexicons. We extract the phrase table from the training examples that an NMT model correctly predicts. Extensive experiments on widely used datasets show that the phrase table is reasonable and consistent across language pairs and random seeds. Equipped with the interpretable phrase table, we find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples. We also revisit some advances that potentially affect the learning of bilingual knowledge (e.g., back-translation), and report some interesting findings. We believe this work opens a new angle for interpreting NMT with statistical models, and provides empirical support for recent advances in improving NMT models.
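To make the idea of a phrase table built from correctly predicted training examples more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that word alignments and a per-token flag marking which target tokens the model predicted correctly are already available, and it uses a simplified form of standard alignment-consistent phrase extraction. The function names, the `max_len` parameter, and the toy sentence pair are all hypothetical.

```python
from collections import Counter


def extract_phrase_pairs(src, tgt, alignment, correct, max_len=3):
    """Collect alignment-consistent phrase pairs whose target tokens
    were all predicted correctly (simplified, assumed procedure)."""
    pairs = []
    for t_start in range(len(tgt)):
        for t_end in range(t_start, min(t_start + max_len, len(tgt))):
            # Skip target spans containing a token the model got wrong.
            if not all(correct[t_start:t_end + 1]):
                continue
            # Source positions aligned to any token in the target span.
            src_pos = [s for s, t in alignment if t_start <= t <= t_end]
            if not src_pos:
                continue
            s_start, s_end = min(src_pos), max(src_pos)
            if s_end - s_start + 1 > max_len:
                continue
            # Consistency check: no source word inside the span may be
            # aligned to a target word outside the span.
            if any(s_start <= s <= s_end and not (t_start <= t <= t_end)
                   for s, t in alignment):
                continue
            pairs.append((" ".join(src[s_start:s_end + 1]),
                          " ".join(tgt[t_start:t_end + 1])))
    return pairs


def build_phrase_table(examples, max_len=3):
    """Map (source phrase, target phrase) to a relative-frequency
    estimate of p(target phrase | source phrase)."""
    pair_counts = Counter()
    src_counts = Counter()
    for src, tgt, alignment, correct in examples:
        for s_phrase, t_phrase in extract_phrase_pairs(
                src, tgt, alignment, correct, max_len):
            pair_counts[(s_phrase, t_phrase)] += 1
            src_counts[s_phrase] += 1
    return {pair: n / src_counts[pair[0]] for pair, n in pair_counts.items()}


# Toy example (hypothetical data): the model mispredicts "is".
examples = [(
    ["das", "Haus", "ist", "klein"],
    ["the", "house", "is", "small"],
    [(0, 0), (1, 1), (2, 2), (3, 3)],   # (source index, target index)
    [True, True, False, True],          # per-target-token correctness
)]
table = build_phrase_table(examples)
```

In this toy case, every phrase pair whose target side covers the mispredicted token is left out, so the resulting table reflects only what the model reproduces correctly. A fuller extractor would also extend phrases over unaligned words, which this sketch omits for brevity.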
