Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
This work targets large-scale multilingual machine translation and is aimed at translation researchers and practitioners, though it appears incremental, since it builds on existing pre-trained models and techniques.
Microsoft developed multilingual machine translation systems for the WMT21 shared task, achieving first-place rankings on all three evaluation tracks using automatic metrics.
This report describes Microsoft's machine translation systems for the WMT21 shared task on large-scale multilingual machine translation. We participated in all three evaluation tracks: the Large Track, which is unconstrained, and the two Small Tracks, which are fully constrained. Our submissions were initialized with DeltaLM\footnote{\url{https://aka.ms/deltalm}}, a generic pre-trained multilingual encoder-decoder model, and fine-tuned on the large collection of parallel data and the data sources allowed by each track's settings. We further applied progressive learning and iterative back-translation to improve performance. Our final submissions ranked first on all three tracks in terms of the automatic evaluation metric.
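The abstract names iterative back-translation without describing its mechanics, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes a toy word-level lexicon as a stand-in for a real translation model (in the report, that role is played by fine-tuned DeltaLM), and every function name here (`train`, `translate`, `flip`, `iterative_back_translation`) is hypothetical. Each round uses the current reverse model to turn target-side monolingual text into synthetic source sentences, retrains the forward model on real plus synthetic pairs, and then does the same in the other direction.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]   # (source sentence, target sentence)
Model = Dict[str, str]   # toy word lexicon standing in for a real NMT model


def train(pairs: List[Pair]) -> Model:
    """Toy 'training': build a word lexicon from position-aligned pairs."""
    lexicon: Model = {}
    for src, tgt in pairs:
        src_words, tgt_words = src.split(), tgt.split()
        if len(src_words) != len(tgt_words):
            continue  # the toy aligner only handles equal-length pairs
        for s, t in zip(src_words, tgt_words):
            lexicon.setdefault(s, t)  # earlier (real) pairs take precedence
    return lexicon


def translate(model: Model, sentence: str) -> str:
    """Translate word by word; unknown words are copied through unchanged."""
    return " ".join(model.get(w, w) for w in sentence.split())


def flip(pairs: List[Pair]) -> List[Pair]:
    """Swap the two sides of each pair to train the opposite direction."""
    return [(tgt, src) for src, tgt in pairs]


def iterative_back_translation(
    parallel: List[Pair],
    mono_src: List[str],
    mono_tgt: List[str],
    rounds: int = 2,
) -> Tuple[Model, Model]:
    """Alternately improve forward (src->tgt) and backward (tgt->src) models."""
    forward = train(parallel)
    backward = train(flip(parallel))
    for _ in range(rounds):
        # Back-translate real target text into synthetic source text,
        # then retrain the forward model on real + synthetic pairs.
        synth_fwd = [(translate(backward, t), t) for t in mono_tgt]
        forward = train(parallel + synth_fwd)
        # Mirror step: real source text, synthetic target text,
        # used as (input, output) pairs for the backward model.
        synth_bwd = [(translate(forward, s), s) for s in mono_src]
        backward = train(flip(parallel) + synth_bwd)
    return forward, backward
```

In this sketch the real side of every synthetic pair is always the output side of the model being trained, which is the usual motivation for back-translation: the model learns to produce genuine text even when its input is noisy.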