CL | AI | LG | Mar 12, 2021

Bilingual Dictionary-based Language Model Pretraining for Neural Machine Translation

arXiv:2103.07040v1 | 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the challenge of high data costs in machine translation, particularly for low-resource languages, by offering a more efficient pretraining method.

The paper aims to reduce the reliance of neural machine translation on expensive parallel corpora by proposing a Bilingual Dictionary-based Language Model (BDLM) that incorporates translation information from dictionaries during pretraining. For Chinese-English, BDLM achieved 55.0 BLEU on WMT-News19 and 24.3 BLEU on WMT20 news-commentary, outperforming the Vanilla Transformer by more than 8.4 and 2.3 BLEU, respectively.

Recent studies have demonstrated a perceptible improvement in the performance of neural machine translation from cross-lingual language model pretraining (Lample and Conneau, 2019), especially with Translation Language Modeling (TLM). To alleviate TLM's need for expensive parallel corpora, in this work we incorporate the translation information from dictionaries into the pretraining process and propose a novel Bilingual Dictionary-based Language Model (BDLM). We evaluate our BDLM on Chinese, English, and Romanian. For Chinese-English, we obtained 55.0 BLEU on WMT-News19 (Tiedemann, 2012) and 24.3 BLEU on WMT20 news-commentary, outperforming the Vanilla Transformer (Vaswani et al., 2017) by more than 8.4 BLEU and 2.3 BLEU, respectively. According to our results, the BDLM also has advantages in convergence speed and in predicting rare words. The increase in BLEU for WMT16 Romanian-English also shows its effectiveness in low-resource language translation.
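
The abstract does not spell out the pretraining objective, so the following is only an illustrative sketch of one way dictionary information could enter a masked-language-model setup: tokens that have a dictionary entry are masked, and the prediction target is the dictionary translation rather than the original token. The function name `make_dictionary_lm_example`, the `[MASK]` symbol, the `mask_prob` parameter, and the toy dictionary are all assumptions made for illustration, not details taken from the paper.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def make_dictionary_lm_example(tokens, dictionary, mask_prob=0.15, rng=None):
    """Build one (inputs, labels) pair from a monolingual sentence.

    Tokens that appear in `dictionary` are masked with probability
    `mask_prob`; the label at a masked position is the dictionary
    translation. Positions that need no prediction get the label None.
    This is an assumed objective for illustration only.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if tok in dictionary and rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(dictionary[tok])
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels


# Toy English-to-Chinese dictionary (hypothetical entries).
toy_dictionary = {"cat": "猫", "milk": "牛奶"}

# mask_prob=1.0 masks every token that has a dictionary entry.
inputs, labels = make_dictionary_lm_example(
    ["the", "cat", "drinks", "milk"], toy_dictionary, mask_prob=1.0
)
print(inputs)
print(labels)
```

Because the targets come from a dictionary rather than from aligned sentence pairs, a scheme of this kind needs only monolingual text plus a word-level lexicon, which is the data-cost argument the abstract makes against TLM.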

