CL · May 19, 2021

Learning Language Specific Sub-network for Multilingual Machine Translation

arXiv:2105.09259v2728 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses parameter interference in multilingual translation models. It offers a method that improves performance on rich-resource language pairs and enhances zero-shot translation, though the advance is incremental because it builds on existing multilingual frameworks.

The paper tackles performance degradation in multilingual machine translation due to parameter interference by proposing LaSS, which learns language-specific sub-networks for each language pair, achieving gains of up to 1.2 BLEU on 36 language pairs and boosting zero-shot translation by an average of 8.3 BLEU on 30 pairs.

Multilingual neural machine translation aims at learning a single translation model for multiple languages. These jointly trained models often suffer from performance degradation on rich-resource language pairs. We attribute this degeneration to parameter interference. In this paper, we propose LaSS to jointly train a single unified multilingual MT model. LaSS learns Language Specific Sub-network (LaSS) for each language pair to counter parameter interference. Comprehensive experiments on IWSLT and WMT datasets with various Transformer architectures show that LaSS obtains gains on 36 language pairs by up to 1.2 BLEU. Besides, LaSS shows its strong generalization performance at easy extension to new language pairs and zero-shot translation. LaSS boosts zero-shot translation with an average of 8.3 BLEU on 30 language pairs. Codes and trained models are available at https://github.com/NLP-Playground/LaSS.
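
The abstract does not say how a language-specific sub-network is selected, so the following is only a minimal sketch under stated assumptions: each language pair gets a binary mask over the shared parameters, chosen here by weight magnitude after a hypothetical per-pair fine-tuning step, and gradient updates for a pair touch only the parameters inside its mask. All function names, the toy weights, and the 50% keep ratio are illustrative and are not taken from the LaSS repository.

```python
# Illustrative sketch: the magnitude-based mask rule and all names below are
# assumptions for exposition, not the authors' implementation.

def magnitude_mask(weights, keep_ratio):
    """Return a 0/1 mask keeping the largest-magnitude fraction of weights."""
    n_keep = max(1, round(len(weights) * keep_ratio))
    # Indices ranked by descending absolute value (stable sort: ties keep order).
    ranked = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    kept = set(ranked[:n_keep])
    return [1 if i in kept else 0 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero out the shared parameters that lie outside the pair's sub-network."""
    return [w * m for w, m in zip(weights, mask)]


def masked_update(weights, grads, mask, lr):
    """One SGD step that only changes parameters inside the pair's sub-network."""
    return [w - lr * g * m for w, g, m in zip(weights, grads, mask)]


# Toy shared parameter vector standing in for a full Transformer.
shared = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3]

# Hypothetical weights after briefly fine-tuning the shared model on each pair.
finetuned = {
    "en-de": [1.2, -0.05, 0.1, -0.9, 0.02, 0.6],
    "en-fr": [0.3, -0.8, 0.9, -0.1, 0.7, 0.05],
}

# One binary mask per language pair.
masks = {pair: magnitude_mask(w, keep_ratio=0.5) for pair, w in finetuned.items()}

# A training step on an en-de batch leaves parameters outside its mask untouched.
updated = masked_update(shared, [1.0] * len(shared), masks["en-de"], lr=0.1)
```

In this toy setup the two masks happen to be disjoint, so an update for one pair cannot disturb the parameters the other pair relies on, which is the intuition behind countering parameter interference. In practice masks for different pairs may overlap, and the overlapping parameters remain shared.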

Code Implementations: 1 repo
