Counter-Interference Adapter for Multilingual Machine Translation
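This work addresses a key bottleneck in multilingual translation: a single unified model tends to underperform separately trained bilingual models on rich-resource languages. It offers researchers and practitioners a practical way to improve model performance across multiple languages.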
The paper tackles performance degradation in unified multilingual machine translation models by proposing CIAT, an adapted Transformer model that reduces interference from joint training. CIAT outperforms strong multilingual baselines on 64 of 66 language directions, with gains of more than 0.5 BLEU on 42 of them.
Developing a unified multilingual model has long been a pursuit for machine translation. However, existing approaches suffer from performance degradation: a single multilingual model is inferior to separately trained bilingual ones on rich-resource languages. We conjecture that this phenomenon is due to interference caused by joint training with multiple languages. To address the issue, we propose CIAT, an adapted Transformer model with a small parameter overhead for multilingual machine translation. We evaluate CIAT on multiple benchmark datasets, including IWSLT, OPUS-100, and WMT. Experiments show that CIAT consistently outperforms strong multilingual baselines on 64 of the 66 language directions, 42 of which see an improvement of more than 0.5 BLEU. Our code is available at https://github.com/Yaoming95/CIAT.
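The abstract does not spell out CIAT's architecture, so the following is only a minimal sketch of the general idea behind adapter-style models: a generic residual bottleneck adapter, kept separately per language, added on top of a shared hidden state. It is not the authors' design. The class `BottleneckAdapter`, the helper `build_adapters`, the language codes, and the dimensions are all hypothetical and chosen for illustration. It shows why the parameter overhead of such modules can stay small: each adapter adds roughly `2 * d_model * d_bottleneck` weights.

```python
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class BottleneckAdapter:
    """Generic residual bottleneck adapter: x + W_up * relu(W_down * x).

    Assumption: this is a common adapter pattern, not the CIAT module itself.
    """

    def __init__(self, d_model, d_bottleneck, rng):
        # Down-projection from the model width to a narrow bottleneck.
        self.w_down = [
            [rng.gauss(0.0, 0.02) for _ in range(d_model)]
            for _ in range(d_bottleneck)
        ]
        # Zero-initialised up-projection, so the adapter starts as the identity.
        self.w_up = [[0.0] * d_bottleneck for _ in range(d_model)]

    def num_parameters(self):
        """Count weights in both projections (biases omitted for brevity)."""
        return sum(len(r) for r in self.w_down) + sum(len(r) for r in self.w_up)

    def __call__(self, hidden):
        down = [max(0.0, v) for v in matvec(self.w_down, hidden)]  # ReLU
        up = matvec(self.w_up, down)
        # Residual connection keeps the shared representation intact.
        return [h + u for h, u in zip(hidden, up)]


def build_adapters(languages, d_model, d_bottleneck, seed=0):
    """Create one independent adapter per language (hypothetical helper)."""
    rng = random.Random(seed)
    return {
        lang: BottleneckAdapter(d_model, d_bottleneck, rng) for lang in languages
    }


# Hypothetical language codes and toy dimensions.
adapters = build_adapters(["de", "fr", "zh"], d_model=8, d_bottleneck=2)
hidden = [0.5] * 8               # stand-in for a shared Transformer hidden state
output = adapters["de"](hidden)  # route through the German-specific adapter
```

Because each language routes through its own small module while the main network stays shared, updates for one language do not overwrite another language's adapter weights. That is the intuition behind reducing interference from joint training, under the assumptions stated above.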