CL | Mar 31, 2023

Exploiting Multilingualism in Low-resource Neural Machine Translation via Adversarial Learning

Amit Kumar, Ajay Pratap, Anil Kumar Singh

arXiv:2303.18011v1 | 0.92 citations | h-index: 16

Originality: Incremental advance

AI Analysis

This work addresses performance issues in multilingual NMT for low-resource languages, offering an incremental improvement over existing methods.

The paper tackles performance degradation in GAN-based multilingual neural machine translation (NMT) by proposing a Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach. It reports gains of up to 4 BLEU points over state-of-the-art methods on low-resource language pairs and shows robustness in zero-shot scenarios.

Generative Adversarial Networks (GANs) offer a promising approach for Neural Machine Translation (NMT). However, feeding multiple morphologically diverse languages into a single model during training reduces NMT performance. As in bilingual models, GAN-based multilingual NMT considers only one reference translation for each sentence during training. This single reference limits how much the GAN model can learn about the source sentence representation. In this article, we therefore propose a Denoising Adversarial Auto-encoder-based Sentence Interpolation (DAASI) approach, which performs sentence interpolation by learning the intermediate latent representation of the source and target sentences of multilingual language pairs. In addition to the latent representation, we use the Wasserstein-GAN approach for the multilingual NMT model, incorporating model-generated sentences in multiple languages into the reward computation. This computed reward optimizes the performance of the GAN-based multilingual model. We conduct experiments on low-resource language pairs and find that our approach outperforms existing state-of-the-art approaches for multilingual NMT, with a performance gain of up to 4 BLEU points. Moreover, we apply our trained model to zero-shot language pairs in an unsupervised scenario and show the robustness of the proposed approach.
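The abstract does not give implementation details, so the following is only a minimal sketch of the two generic ideas it names: interpolating between latent sentence representations, and a Wasserstein-style score gap between reference and generated sentences. The function names, the use of linear interpolation, and the plain-list vectors are all assumptions made for illustration; this is not the authors' DAASI model or their reward computation.

```python
from typing import List, Sequence


def interpolate_latents(z_src: Sequence[float],
                        z_tgt: Sequence[float],
                        alpha: float) -> List[float]:
    """Linearly interpolate between two latent vectors (hypothetical helper).

    alpha = 0 returns z_src, alpha = 1 returns z_tgt. Linear interpolation
    is an assumption here; the abstract does not state the scheme used.
    """
    if len(z_src) != len(z_tgt):
        raise ValueError("latent vectors must have the same dimension")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * s + alpha * t for s, t in zip(z_src, z_tgt)]


def wasserstein_gap(real_scores: Sequence[float],
                    fake_scores: Sequence[float]) -> float:
    """Mean critic score on references minus mean score on generated output.

    In a standard Wasserstein GAN the critic is trained to maximize this
    gap and the generator to shrink it. The scores would come from a
    critic network, which is not modeled in this sketch.
    """
    if not real_scores or not fake_scores:
        raise ValueError("both score lists must be non-empty")
    real_mean = sum(real_scores) / len(real_scores)
    fake_mean = sum(fake_scores) / len(fake_scores)
    return real_mean - fake_mean


if __name__ == "__main__":
    # Toy 2-dimensional latents standing in for encoded sentences.
    midpoint = interpolate_latents([0.0, 2.0], [1.0, 4.0], alpha=0.5)
    print(midpoint)
    # Toy critic scores for two references and two generated sentences.
    print(wasserstein_gap([1.0, 3.0], [0.0, 1.0]))
```

In a real system the latent vectors would come from the auto-encoder's encoder and the scores from a trained critic; the toy numbers above only exercise the arithmetic.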
