Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens
This work addresses the challenge of understanding and improving knowledge transfer in multilingual machine translation, particularly for low-resource languages. It is incremental in that it builds on existing representational analysis methods.
The authors tackled the problem of measuring knowledge transfer in multilingual machine translation beyond translation quality. They introduced Representational Transfer Potential (RTP) to assess representational similarities between languages and showed that it correlates with changes in translation quality. They also developed a training scheme with an auxiliary similarity loss that improved translation quality for low- and mid-resource languages across multiple setups.
We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation. To support this claim, we introduce Representational Transfer Potential (RTP), which measures representational similarities between languages. We show that RTP can measure both positive and negative transfer (interference), and find that RTP is strongly correlated with changes in translation quality, indicating that transfer does occur. Furthermore, we investigate data and language characteristics that are relevant for transfer, and find that multi-parallel overlap is an important yet under-explored feature. Based on this, we develop a novel training scheme, which uses an auxiliary similarity loss that encourages representations to be more invariant across languages by taking advantage of multi-parallel data. We show that our method yields increased translation quality for low- and mid-resource languages across multiple data and model setups.
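The abstract does not give the exact form of the auxiliary similarity loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that two source sentences in different languages share the same target sentence (a multi-parallel pair), that each sentence is represented by mean-pooled encoder states, and that dissimilarity is measured as one minus cosine similarity. The function names, the pooling choice, the distance measure, and the `aux_weight` parameter are all hypothetical.

```python
import math


def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one sentence vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(tok[d] for tok in token_vectors) / n for d in range(dim)]


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def auxiliary_similarity_loss(enc_a, enc_b):
    """Hypothetical auxiliary loss: 1 - cosine similarity of pooled encoder states.

    enc_a and enc_b are encoder outputs (lists of token vectors) for two source
    sentences in different languages that translate to the same target sentence.
    The loss is 0 when the pooled representations point in the same direction.
    """
    return 1.0 - cosine_similarity(mean_pool(enc_a), mean_pool(enc_b))


def total_loss(translation_loss, enc_a, enc_b, aux_weight=0.1):
    """Combine the translation loss with the weighted auxiliary term (assumed form)."""
    return translation_loss + aux_weight * auxiliary_similarity_loss(enc_a, enc_b)
```

In this sketch, minimizing the auxiliary term pulls the encoder representations of the two source languages toward each other, which is one way to encourage the cross-lingual invariance the abstract describes. In an actual training setup the same computation would be expressed with differentiable tensor operations so that gradients flow into the encoder.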