CL · Nov 1, 2018

Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages

Rudra Murthy, Anoop Kunchukuttan, Pushpak Bhattacharyya

arXiv:1811.00383v232.11103 citations

Originality: Incremental advance

AI Analysis

This work addresses translation quality for low-resource languages where parallel data is scarce. It is an incremental advance because it builds on existing transfer learning methods.

The paper tackled the problem of divergent word order limiting transfer learning benefits in multilingual neural machine translation for low-resource languages, and showed that pre-ordering the assisting language to match the source language's word order leads to significant improvements in translation quality.

Transfer learning approaches for Neural Machine Translation (NMT) train an NMT model on the assisting-target language pair (parent model), which is later fine-tuned for the source-target language pair of interest (child model), with the target language being the same. In many cases, the assisting language has a different word order from the source language. We show that divergent word order limits the benefits of transfer learning when little to no parallel corpus between the source and target language is available. To bridge this divergence, we propose to pre-order the assisting language sentence to match the word order of the source language and train the parent model. Our experiments on many language pairs show that bridging the word order gap leads to significant improvement in the translation quality.
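
As a rough illustration of the pre-ordering idea described in the abstract, the sketch below reorders a subject-verb-object (SVO) assisting-language sentence into subject-object-verb (SOV) order before it would be used to train the parent model. The chunk roles, the function names, and the single reordering rule are assumptions made for illustration; they are not the authors' pre-ordering system, which the abstract does not describe in detail.

```python
from typing import List, Tuple

# Hypothetical input format: a sentence already split into (role, text) chunks.
# Roles used here: "S" (subject), "V" (verb), "O" (object); any other role is kept in place.
Chunk = Tuple[str, str]


def preorder_svo_to_sov(chunks: List[Chunk]) -> List[Chunk]:
    """Toy rule: hold each verb chunk back and emit it right after the next object chunk."""
    reordered: List[Chunk] = []
    pending_verb = None
    for chunk in chunks:
        role = chunk[0]
        if role == "V":
            # A second verb before any object flushes the earlier verb unchanged.
            if pending_verb is not None:
                reordered.append(pending_verb)
            pending_verb = chunk
        elif role == "O" and pending_verb is not None:
            reordered.append(chunk)
            reordered.append(pending_verb)
            pending_verb = None
        else:
            reordered.append(chunk)
    # A verb with no following object (e.g. intransitive) stays at the end.
    if pending_verb is not None:
        reordered.append(pending_verb)
    return reordered


def to_text(chunks: List[Chunk]) -> str:
    """Join chunk texts back into a single sentence string."""
    return " ".join(text for _, text in chunks)


if __name__ == "__main__":
    sentence = [("S", "the boy"), ("V", "ate"), ("O", "an apple")]
    print(to_text(preorder_svo_to_sov(sentence)))
```

In a pipeline of the kind the abstract outlines, the reordered assisting-language sentences would replace the originals in the parent model's training data, and the child model would then be fine-tuned on whatever source-target data is available.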

