CL · May 10, 2016

Vocabulary Manipulation for Neural Machine Translation

arXiv:1605.03209v1 · 67 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficiency issues faced by neural machine translation practitioners, though the advance is incremental because it builds on existing vocabulary reduction methods.

The paper tackles the high computational and memory costs of large vocabularies in neural machine translation by introducing sentence-level or batch-level vocabularies. This reduces time and memory usage while improving translation performance by 1 BLEU point over the large-vocabulary system of Jean et al. (2015) on an English-to-French task.

In order to capture rich language phenomena, neural machine translation models have to use a large vocabulary size, which requires high computing time and large memory usage. In this paper, we alleviate this issue by introducing a sentence-level or batch-level vocabulary, which is only a very small sub-set of the full output vocabulary. For each sentence or batch, we only predict the target words in its sentence-level or batch-level vocabulary. Thus, we reduce both the computing time and the memory usage. Our method simply takes into account the translation options of each word or phrase in the source sentence, and picks a very small target vocabulary for each sentence based on a word-to-word translation model or a bilingual phrase library learned from a traditional machine translation model. Experimental results on the large-scale English-to-French task show that our method achieves better translation performance by 1 BLEU point over the large vocabulary neural machine translation system of Jean et al. (2015).
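The selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The function names, the `lexicon` dictionary (standing in for a word-to-word translation model), the `common_words` list, and the `max_options` cutoff are all hypothetical choices made for illustration. The idea shown is only that the candidate target vocabulary is assembled from the translation options of the source words, and that output scores are then computed and normalized over that small set rather than over the full vocabulary.

```python
import math
from typing import Dict, Iterable, List, Sequence, Set


def sentence_vocab(
    source_tokens: Iterable[str],
    lexicon: Dict[str, List[str]],   # hypothetical: source word -> ranked target options
    common_words: Iterable[str],     # assumption: always-kept words such as "</s>"
    max_options: int = 10,           # assumption: cap on options kept per source word
) -> Set[str]:
    """Collect a small target vocabulary for one source sentence."""
    vocab = set(common_words)
    for tok in source_tokens:
        vocab.update(lexicon.get(tok, [])[:max_options])
    return vocab


def batch_vocab(
    batch: Iterable[Sequence[str]],
    lexicon: Dict[str, List[str]],
    common_words: Iterable[str],
    max_options: int = 10,
) -> Set[str]:
    """Union of the sentence-level vocabularies of every sentence in a batch."""
    vocab = set(common_words)
    for sentence in batch:
        vocab |= sentence_vocab(sentence, lexicon, (), max_options)
    return vocab


def restricted_softmax(
    hidden: Sequence[float],
    output_rows: Dict[str, Sequence[float]],  # hypothetical: target word -> output weight row
    vocab: Set[str],
) -> Dict[str, float]:
    """Score and normalize only the words in `vocab`, skipping all other rows."""
    words = sorted(w for w in vocab if w in output_rows)
    if not words:
        raise ValueError("no candidate word has an output row")
    scores = [sum(h * w for h, w in zip(hidden, output_rows[word])) for word in words]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {word: e / total for word, e in zip(words, exps)}
```

With a toy lexicon such as `{"the": ["le", "la", "les"], "cat": ["chat"]}`, the sentence `["the", "cat"]` yields a candidate set of four target words plus whatever common words are supplied, so the softmax touches only those rows. The cost of the output layer then scales with the size of the selected set instead of the full vocabulary, which is the saving the abstract refers to.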

