CL, AI, LG | Jun 8, 2023

Improving Language Model Integration for Neural Machine Translation

arXiv:2306.05077v1 | 223 citations | h-index: 104
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in machine translation research, the integration of external language models, but the advance is incremental: it transfers an existing concept from speech recognition.

The paper tackles the integration of external language models into neural machine translation by neutralizing, at decoding time, the implicit language model that the translation model learns during training. Doing so boosts the performance of language model fusion, but the approach is still outperformed by back-translation.

The integration of language models for neural machine translation has been extensively studied in the past. It has been shown that an external language model, trained on additional target-side monolingual data, can help improve translation quality. However, there has always been the assumption that the translation model also learns an implicit target-side language model during training, which interferes with the external language model at decoding time. Recently, some works on automatic speech recognition have demonstrated that, if the implicit language model is neutralized in decoding, further improvements can be gained when integrating an external language model. In this work, we transfer this concept to the task of machine translation and compare with the most prominent way of including additional monolingual data - namely back-translation. We find that accounting for the implicit language model significantly boosts the performance of language model fusion, although this approach is still outperformed by back-translation.
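To make the idea of neutralizing the implicit language model concrete, here is a minimal sketch of a per-token decoding score. The abstract does not give a formula, so the log-linear form below (translation model score, plus a weighted external LM score, minus a weighted estimate of the implicit LM score) is an assumption modeled on common practice in the speech recognition literature. The function names, the weights `lam_ext` and `lam_ilm`, and the toy probabilities are all hypothetical.

```python
import math


def fused_score(log_p_tm, log_p_ext_lm, log_p_ilm, lam_ext=0.3, lam_ilm=0.2):
    """Combine log-probabilities for one candidate token.

    Assumed form (not taken from the paper):
        log p_TM + lam_ext * log p_extLM - lam_ilm * log p_ILM
    Subtracting the implicit-LM term is what "neutralizes" it.
    """
    return log_p_tm + lam_ext * log_p_ext_lm - lam_ilm * log_p_ilm


def pick_next_token(tm_probs, ext_lm_probs, ilm_probs, lam_ext=0.3, lam_ilm=0.2):
    """Return the best-scoring token and all scores for one decoding step.

    Each argument maps candidate tokens to probabilities from the
    translation model, the external LM, and the implicit-LM estimate.
    """
    scores = {
        tok: fused_score(
            math.log(tm_probs[tok]),
            math.log(ext_lm_probs[tok]),
            math.log(ilm_probs[tok]),
            lam_ext,
            lam_ilm,
        )
        for tok in tm_probs
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy distributions over two candidate tokens (illustrative values only).
tm = {"house": 0.6, "home": 0.4}
ext_lm = {"house": 0.2, "home": 0.8}
ilm = {"house": 0.7, "home": 0.3}

best_plain, _ = pick_next_token(tm, ext_lm, ilm, lam_ext=0.0, lam_ilm=0.0)
best_fused, _ = pick_next_token(tm, ext_lm, ilm, lam_ext=1.0, lam_ilm=1.0)
```

With both weights at zero the choice follows the translation model alone. With both at one, the external LM's preference and the subtraction of the implicit-LM estimate together change which token wins in this toy case. In practice the weights would be tuned on held-out data, and how the implicit LM is estimated is a separate design decision that this sketch does not cover.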

