CL, NE
Jan 6, 2020

Exploring Benefits of Transfer Learning in Neural Machine Translation

arXiv:2001.01622v1
19 citations
AI Analysis

This work addresses the challenge of low-resource language translation for NLP researchers and practitioners, but it is incremental, as it builds on existing transfer learning concepts.

The paper tackles the problem that neural machine translation requires large parallel datasets, which low-resource language pairs lack. It explores cross-lingual transfer learning from high-resource models and shows improvements in translation performance, with even larger gains when the high-resource model is prepared for transfer in advance.

Neural machine translation is known to require large numbers of parallel training sentences, which generally prevents it from excelling on low-resource language pairs. This thesis explores the use of cross-lingual transfer learning on neural networks as a way of addressing the lack of resources. We propose several transfer learning approaches to reuse a model pretrained on a high-resource language pair. We pay particular attention to the simplicity of the techniques. We study two scenarios: (a) when we reuse the high-resource model without any prior modifications to its training process and (b) when we can prepare the first-stage high-resource model for transfer learning in advance. For the former scenario, we present a proof-of-concept method by reusing a model trained by other researchers. In the latter scenario, we present a method which reaches even larger improvements in translation performance. Apart from the proposed techniques, we focus on an in-depth analysis of transfer learning techniques and try to shed some light on transfer learning improvements. We show how our techniques address specific problems of low-resource languages and are suitable even in high-resource transfer learning. We evaluate the potential drawbacks and behavior by studying transfer learning in various situations, for example, under artificially damaged training corpora, or with various model parts fixed.
