CL Mar 10, 2021

Self-Learning for Zero Shot Neural Machine Translation

Surafel M. Lakew, Matteo Negri, Marco Turchi

arXiv:2103.05951v1, 0.71 citations

Originality: Highly original

AI Analysis

This work targets low-resource translation for users who need cross-lingual communication between languages that share no parallel data. It proposes a new method rather than an incremental improvement on existing ones.

The paper tackles zero-shot neural machine translation without relying on a pivot language, reporting up to +5.93 BLEU over a supervised bilingual baseline across four diverse zero-shot pairs.

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions. However, evaluations using real-world low-resource languages still result in unsatisfactory performance. This work proposes a novel zero-shot NMT modeling approach that learns without the now-standard assumption of a pivot language sharing parallel data with the zero-shot source and target languages. Our approach is based on three stages: initialization from any pre-trained NMT model observing at least the target language, augmentation of source sides leveraging target monolingual data, and learning to optimize the initial model to the zero-shot pair, where the latter two constitute a self-learning cycle. Empirical findings involving four zero-shot pairs that are diverse in language family, script, and relatedness show the effectiveness of our approach, with up to +5.93 BLEU improvement against a supervised bilingual baseline. Compared to unsupervised NMT, consistent improvements are observed even in a domain-mismatch setting, attesting to the usability of our method.
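The three stages can be pictured as a loop in which augmentation and learning alternate after a one-time initialization. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the names `self_learning`, `translate_to_source`, `fine_tune`, and the toy stand-ins are hypothetical, and the number of rounds is an arbitrary choice. A real system would plug in an actual pre-trained NMT model and training routine.

```python
from typing import Callable, List, Tuple

Sentence = str
Pair = Tuple[Sentence, Sentence]  # (synthetic source, real target)


def self_learning(
    model: object,
    target_monolingual: List[Sentence],
    translate_to_source: Callable[[object, Sentence], Sentence],
    fine_tune: Callable[[object, List[Pair]], object],
    rounds: int = 3,
) -> object:
    """Hypothetical self-learning cycle.

    `model` plays the role of the pre-trained model from the initialization
    stage. Each round performs augmentation (build synthetic source sides
    from target monolingual data) followed by learning (update the model on
    the resulting pairs).
    """
    for _ in range(rounds):
        # Augmentation: the current model generates a source side per target sentence.
        synthetic = [(translate_to_source(model, t), t) for t in target_monolingual]
        # Learning: adapt the model to the zero-shot pair using the synthetic pairs.
        model = fine_tune(model, synthetic)
    return model


# Toy stand-ins (assumptions for illustration only, not a real NMT system).
def toy_translate(model: dict, sentence: Sentence) -> Sentence:
    # Tag the sentence with the model version instead of actually translating.
    return f"src_v{model['version']}({sentence})"


def toy_fine_tune(model: dict, pairs: List[Pair]) -> dict:
    # Pretend to train: bump the version and remember the last training pairs.
    return {"version": model["version"] + 1, "last_pairs": pairs}


final = self_learning(
    {"version": 0, "last_pairs": []},
    ["hallo welt", "guten tag"],
    toy_translate,
    toy_fine_tune,
    rounds=3,
)
```

In this toy run each round regenerates the synthetic source sides with the most recent model, which is the property that makes the cycle self-improving in principle: better models yield better synthetic data for the next round.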
