CL · Mar 11, 2021

Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution

Xavier Garcia, Noah Constant, Ankur P. Parikh, Orhan Firat

arXiv:2103.06799v132.0733 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of continual learning for multilingual machine translation, enabling models to incorporate new languages without extensive retraining. The advance is incremental, as it builds on existing vocabulary adaptation techniques.

The paper tackles the problem of efficiently extending multilingual machine translation models to new languages. It proposes a vocabulary adaptation scheme that adds distant languages with unseen scripts while incurring only minor degradation in translation quality on the original language pairs.

We propose a straightforward vocabulary adaptation scheme to extend the language capacity of multilingual machine translation models, paving the way towards efficient continual learning for multilingual machine translation. Our approach is suitable for large-scale datasets, applies to distant languages with unseen scripts, incurs only minor degradation on the translation performance for the original language pairs and provides competitive performance even in the case where we only possess monolingual data for the new languages.
