CL · Jun 22, 2017

Neural Machine Translation with Gumbel-Greedy Decoding

arXiv:1706.07518v1 · 40 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for more efficient decoding in machine translation, though it appears incremental because it builds on existing reparameterization techniques.

The paper tackles the problem of avoiding heuristic search in neural machine translation by proposing Gumbel-Greedy Decoding, which trains a generative network using Gumbel-Softmax reparameterization, and empirically shows it is effective for generating discrete word sequences.

Previous neural machine translation models used some heuristic search algorithms (e.g., beam search) in order to avoid solving the maximum a posteriori problem over translation sentences at test time. In this paper, we propose the Gumbel-Greedy Decoding which trains a generative network to predict translation under a trained model. We solve such a problem using the Gumbel-Softmax reparameterization, which makes our generative network differentiable and trainable through standard stochastic gradient methods. We empirically demonstrate that our proposed model is effective for generating sequences of discrete words.
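
The Gumbel-Softmax reparameterization mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`sample_gumbel`, `gumbel_softmax`, `greedy_token`), the example logits, and the temperature values are all hypothetical, and the code covers only the forward sampling step in plain Python, with no gradient computation or translation model.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via the inverse-CDF transform -log(-log(U))."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, temperature, rng):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = [(logit + sample_gumbel(rng)) / temperature for logit in logits]
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_token(probs):
    """Index of the largest entry, i.e. the greedy (argmax) choice."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical unnormalized scores over a 4-word vocabulary.
logits = [2.0, 0.5, -1.0, 0.1]
rng = random.Random(0)
soft = gumbel_softmax(logits, temperature=0.5, rng=rng)
token = greedy_token(soft)
```

The argmax of the relaxed sample does not depend on the temperature, since dividing by a positive constant and applying softmax both preserve ordering. Lowering the temperature only pushes the relaxed vector closer to a one-hot vector, which is what lets a continuous, differentiable sample stand in for a discrete word choice during training.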

