CL | Apr 17, 2018

When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?

arXiv:1804.06323v2 | 1234 citations
AI Analysis

This paper addresses data scarcity in NMT for researchers and practitioners and offers insight into when pre-trained embeddings help, but the contribution is incremental because it builds on existing embedding methods.

The paper investigates the effectiveness of pre-trained word embeddings in neural machine translation, particularly in low-resource scenarios, and finds they can improve performance by up to 20 BLEU points in favorable conditions.

The performance of Neural Machine Translation (NMT) systems often suffers in low-resource scenarios where sufficiently large-scale parallel corpora cannot be obtained. Pre-trained word embeddings have proven to be invaluable for improving performance in natural language analysis tasks, which often suffer from paucity of data. However, their utility for NMT has not been extensively explored. In this work, we perform five sets of experiments that analyze when we can expect pre-trained word embeddings to help in NMT tasks. We show that such embeddings can be surprisingly effective in some cases -- providing gains of up to 20 BLEU points in the most favorable setting.

Code Implementations: 1 repo