CL · Apr 23, 2017

Neural Machine Translation via Binary Code Prediction

arXiv:1704.06918v1 · 31 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency issues in NMT that matter to practitioners, though it is an incremental advance because it builds on existing binary code prediction methods.

The paper tackles the computational and memory inefficiency of the output layer in neural machine translation by predicting a binary code for each word. Compared with a softmax output layer, the reported results reduce memory usage to less than 1/10 and speed up decoding on CPUs by 5 to 10 times, while achieving BLEU scores close to those of softmax.

In this paper, we propose a new method for calculating the output layer in neural machine translation systems. The method is based on predicting a binary code for each word and can reduce computation time/memory requirements of the output layer to be logarithmic in vocabulary size in the best case. In addition, we also introduce two advanced approaches to improve the robustness of the proposed model: using error-correcting codes and combining softmax and binary codes. Experiments on two English-Japanese bidirectional translation tasks show proposed models achieve BLEU scores that approach the softmax, while reducing memory usage to the order of less than 1/10 and improving decoding speed on CPUs by x5 to x10.
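The core idea in the abstract, predicting a short bit string instead of one score per vocabulary word, can be illustrated with a minimal sketch. This is not the authors' implementation: the bit assignment (plain binary representation of the word ID), the 0.5 threshold, and the function names `bits_needed`, `word_to_code`, `code_to_word`, and `output_layer_params` are all assumptions made for illustration. It also omits the error-correcting codes and the softmax/binary hybrid mentioned above.

```python
def bits_needed(vocab_size):
    """Smallest B with 2**B >= vocab_size (assumed bit-width rule)."""
    return max(1, (vocab_size - 1).bit_length())


def word_to_code(word_id, num_bits):
    """Hypothetical mapping: the binary digits of the word ID, least significant bit first."""
    return [(word_id >> i) & 1 for i in range(num_bits)]


def code_to_word(bit_probs, threshold=0.5):
    """Decode per-bit probabilities (e.g. sigmoid outputs) back to a word ID by thresholding."""
    return sum(1 << i for i, p in enumerate(bit_probs) if p >= threshold)


def output_layer_params(hidden_size, vocab_size):
    """Weights plus biases of a single linear output layer: softmax vs. binary code."""
    softmax = (hidden_size + 1) * vocab_size
    binary = (hidden_size + 1) * bits_needed(vocab_size)
    return softmax, binary


if __name__ == "__main__":
    num_bits = bits_needed(65536)
    print(num_bits)                                # bits per word for a 65,536-word vocabulary
    print(word_to_code(5, 4))                      # bit string for word ID 5
    print(code_to_word([0.9, 0.2, 0.8, 0.1]))      # thresholded probabilities decoded to an ID
    print(output_layer_params(512, 65536))         # (softmax params, binary-code params)
```

Under these assumptions, the output layer needs one unit per bit rather than one per word, which is where the best-case logarithmic dependence on vocabulary size comes from. The figures reported in the abstract (memory under 1/10, CPU decoding 5 to 10 times faster) refer to the authors' full models, not to this simplified parameter count.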

