Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
This work addresses efficient and effective translation across many languages, with particular benefit in low-resource scenarios. It is incremental in that it builds on existing neural machine translation frameworks.
The authors scale neural machine translation to multiple languages by proposing a multi-way, multilingual model with a single attention mechanism shared across all language pairs. The number of parameters grows only linearly with the number of languages, and translation quality improves over models trained on a single language pair, especially for low-resource pairs, as demonstrated on ten WMT'15 language pairs.
We propose multi-way, multilingual neural machine translation. The proposed approach enables a single neural translation model to translate between multiple languages, with a number of parameters that grows only linearly with the number of languages. This is made possible by having a single attention mechanism that is shared across all language pairs. We train the proposed multi-way, multilingual model on ten language pairs from WMT'15 simultaneously and observe clear performance improvements over models trained on only one language pair. In particular, we observe that the proposed model significantly improves the translation quality of low-resource language pairs.
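To make the linear-growth claim concrete, the following is a minimal counting sketch, not the authors' implementation. It assumes one encoder and one decoder per language plus a single shared attention module, and compares that against a hypothetical baseline that trains a separate model for every ordered language pair. The function names and the component sizes are illustrative assumptions; the abstract does not give them.

```python
def pairwise_param_count(num_languages, enc_params, dec_params, attn_params):
    """Hypothetical baseline: one separate model per ordered (source, target) pair.

    Each model carries its own encoder, decoder, and attention, so the total
    grows quadratically with the number of languages.
    """
    num_pairs = num_languages * (num_languages - 1)
    return num_pairs * (enc_params + dec_params + attn_params)


def shared_attention_param_count(num_languages, enc_params, dec_params, attn_params):
    """Assumed multi-way layout: one encoder and one decoder per language,
    plus a single attention module shared by every language pair.

    Adding a language adds one encoder and one decoder, so the total grows
    linearly with the number of languages.
    """
    return num_languages * (enc_params + dec_params) + attn_params


if __name__ == "__main__":
    # Placeholder component sizes (assumptions, in arbitrary units).
    enc, dec, attn = 10, 10, 1
    for n in (2, 5, 10):
        separate = pairwise_param_count(n, enc, dec, attn)
        shared = shared_attention_param_count(n, enc, dec, attn)
        print(f"{n} languages: separate models = {separate}, shared attention = {shared}")
```

Under these assumptions, each additional language costs exactly one encoder plus one decoder in the shared layout, whereas the per-pair baseline adds a full model for every new translation direction.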