Improving Similar Language Translation With Transfer Learning
This work addresses low-resource translation between similar languages for multilingual NLP applications, but it is incremental, since it applies existing transfer learning methods to new language pairs.
The authors tackled translation between similar languages using transfer learning from pre-trained neural machine translation models, achieving top BLEU scores of 82.79 for Catalan-Spanish and 87.11 for Portuguese-Spanish in the WMT 2021 shared task.
We investigate transfer learning based on pre-trained neural machine translation models to translate between (low-resource) similar languages. This work is part of our contribution to the WMT 2021 Similar Languages Translation Shared Task, where we submitted models for different language pairs, including French-Bambara, Spanish-Catalan, and Spanish-Portuguese in both directions. Our models for Catalan-Spanish ($82.79$ BLEU) and Portuguese-Spanish ($87.11$ BLEU) rank first in the official shared task evaluation, and we are the only team to submit models for the French-Bambara pairs.