CL · Jun 3, 2018

Dense Information Flow for Neural Machine Translation

arXiv:1806.00722v21113 citations
Originality: Incremental advance
AI Analysis

This work addresses translation quality and training efficiency for machine translation users, but it is incremental: it adapts DenseNet, an existing computer vision model, to NMT.

The paper aims to improve neural machine translation with a densely connected architecture (DenseNMT). It applies dense connections when creating new features in both the encoder and the decoder, and uses a dense attention structure to improve attention quality. The authors report that the resulting model is more competitive and more efficient on multiple datasets.

Recently, neural machine translation has achieved remarkable progress by introducing well-designed deep neural networks into its encoder-decoder framework. From the optimization perspective, residual connections are adopted to improve learning performance for both encoder and decoder in most of these deep architectures, and advanced attention connections are applied as well. Inspired by the success of the DenseNet model in computer vision problems, in this paper, we propose a densely connected NMT architecture (DenseNMT) that is able to train more efficiently for NMT. The proposed DenseNMT not only allows dense connection in creating new features for both encoder and decoder, but also uses the dense attention structure to improve attention quality. Our experiments on multiple datasets show that DenseNMT structure is more competitive and efficient.
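The difference between the residual connections and the DenseNet-style dense connections mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the helper names (`make_layer`, `dense_stack`, `residual_stack`), the `growth` parameter, and the plain linear-plus-ReLU layers are assumptions chosen only for illustration. It shows the one structural idea that is certain from DenseNet: each layer receives the concatenation of all earlier feature maps rather than only the previous layer's output.

```python
import random


def make_layer(in_dim, out_dim, rng):
    """Build a toy linear + ReLU layer (hypothetical stand-in for a real NMT layer)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def layer(x):
        assert len(x) == in_dim
        # One output unit per weight row, followed by ReLU.
        return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

    return layer


def dense_stack(x0, num_layers, growth, seed=0):
    """Dense connectivity: every layer sees the concatenation of ALL earlier features."""
    rng = random.Random(seed)
    features = [list(x0)]
    for _ in range(num_layers):
        concat = [v for f in features for v in f]
        layer = make_layer(len(concat), growth, rng)
        features.append(layer(concat))  # each layer adds `growth` new features
    return [v for f in features for v in f]


def residual_stack(x0, num_layers, seed=0):
    """Residual connectivity: each layer's output is summed with its own input."""
    rng = random.Random(seed)
    h = list(x0)
    for _ in range(num_layers):
        layer = make_layer(len(h), len(h), rng)
        h = [a + b for a, b in zip(h, layer(h))]
    return h


if __name__ == "__main__":
    x = [1.0, 0.5, -0.5, 2.0]
    print(len(dense_stack(x, num_layers=3, growth=2)))   # width grows with depth
    print(len(residual_stack(x, num_layers=3)))          # width stays fixed
```

In the dense variant the original input survives unchanged inside the final representation and the width grows by `growth` per layer, whereas the residual variant keeps a fixed width and mixes earlier features by summation.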

Code Implementations: 1 repo