CL
Oct 1, 2019

A Survey of Methods to Leverage Monolingual Data in Low-resource Neural Machine Translation

arXiv:1910.00373v1
15 citations
Originality: Synthesis-oriented
AI Analysis

This is a survey paper: it summarizes existing approaches and presents no new results, so its contribution is incremental.

The paper reviews methods to improve neural machine translation for low-resource languages by leveraging monolingual data, which is more abundant than parallel data, to address translation quality issues.

Neural machine translation has become the state-of-the-art for language pairs with large parallel corpora. However, the quality of machine translation for low-resource languages leaves much to be desired. There are several approaches to mitigate this problem, such as transfer learning, semi-supervised and unsupervised learning techniques. In this paper, we review the existing methods, where the main idea is to exploit the power of monolingual data, which, compared to parallel, is usually easier to obtain and significantly greater in amount.

