CL | Mar 18, 2021

Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

Alexandra Chronopoulou, Dario Stojanovski, Alexander Fraser

arXiv:2103.10531v231.8731 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of lexical alignment in unsupervised machine translation for low-resource and distant languages, representing an incremental improvement over existing methods.

The paper tackles poor unsupervised neural machine translation (UNMT) performance on low-resource and distant language pairs by enhancing bilingual masked language model pretraining with lexical-level information. The authors report gains of up to 4.5 BLEU on UNMT and better bilingual lexicon induction compared to a UNMT baseline.

Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via self-supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. While cross-lingual pretraining works for similar languages with abundant corpora, it performs poorly in low-resource and distant languages. Previous research has shown that this is because the representations are not sufficiently aligned. In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings. Empirical results demonstrate improved performance both on UNMT (up to 4.5 BLEU) and bilingual lexicon induction using our method compared to a UNMT baseline.
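
As a rough illustration of what "enhancing pretraining with type-level cross-lingual subword embeddings" could look like in practice, the sketch below seeds an embedding table with pre-aligned subword vectors before masked language model training begins. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `init_embedding_table`, the toy vocabulary, the toy vectors, and the random fallback for subwords without a pretrained vector are all hypothetical.

```python
import random


def init_embedding_table(vocab, pretrained, dim, seed=0):
    """Build one embedding row per subword in `vocab`.

    Rows are copied from `pretrained` (hypothetical, already aligned
    cross-lingual subword vectors) when a vector of the right size exists;
    all other rows fall back to small random values.
    """
    rng = random.Random(seed)
    table = []
    n_copied = 0
    for subword in vocab:
        vec = pretrained.get(subword)
        if vec is not None and len(vec) == dim:
            table.append(list(vec))  # copy the aligned vector
            n_copied += 1
        else:
            # Assumption: random fallback for special or unseen subwords.
            table.append([rng.gauss(0.0, dim ** -0.5) for _ in range(dim)])
    return table, n_copied


# Toy shared vocabulary and toy aligned vectors (illustrative values only).
vocab = ["<mask>", "house", "Haus", "ing"]
pretrained = {
    "house": [0.90, 0.10, 0.00],
    "Haus": [0.88, 0.12, 0.01],
}

table, n_copied = init_embedding_table(vocab, pretrained, dim=3)
print(f"copied {n_copied} of {len(vocab)} rows from pretrained vectors")
```

In this toy setup, translation-equivalent subwords start from nearby vectors, which is the kind of lexical-level signal the abstract describes adding to pretraining.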
