Information-Propagation-Enhanced Neural Machine Translation by Relation Model
This work addresses the bottleneck of capturing long-term dependencies in NMT, which is crucial for translation quality, though the contribution appears incremental because it builds on existing encoder-decoder frameworks.
The paper tackles the problem of long-distance dependency in neural machine translation by incorporating a relation network into the encoder-decoder framework to enhance information propagation. The authors report that the resulting model significantly outperforms statistical and state-of-the-art NMT models on two datasets.
Although sequence-to-sequence neural machine translation (NMT) models have achieved state-of-the-art performance in recent years, it is widely recognized that recurrent neural network (RNN) units struggle to capture long-distance state information: as the sequence grows longer, an RNN can hardly find features with long-term dependencies. Similarly, convolutional neural networks (CNNs) have recently been introduced into NMT for speed, but CNNs focus on capturing local features of the sequence. To relieve this issue, we incorporate a relation network into the standard encoder-decoder framework to enhance information propagation in the neural network, ensuring that the information in the source sentence can flow adequately into the decoder. Experiments show that the proposed framework significantly outperforms the statistical MT model and the state-of-the-art NMT model on two data sets of different scales.
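
To make the idea of a relation network over encoder states concrete, the following is a minimal sketch in plain Python. It is not the paper's formulation, which this abstract does not specify. It assumes a generic pairwise relation function g(h_i, h_j) = tanh(W_g [h_i; h_j]) averaged over all positions j; the names `relation_layer`, `matvec`, and `w_g`, the dimensions, and the random inputs are all hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relation_layer(states, w_g):
    """Hypothetical relation layer over encoder states (an assumption,
    not the paper's exact model).

    For each position i, averages g(h_i, h_j) = tanh(W_g [h_i; h_j])
    over all positions j, so every output vector depends on every
    input position regardless of distance.
    """
    n = len(states)
    outputs = []
    for h_i in states:
        acc = [0.0] * len(w_g)
        for h_j in states:
            # List concatenation h_i + h_j forms the pair vector [h_i; h_j].
            pair = matvec(w_g, h_i + h_j)
            acc = [a + math.tanh(p) for a, p in zip(acc, pair)]
        outputs.append([a / n for a in acc])
    return outputs


# Toy stand-ins for encoder hidden states: 6 positions, 4 dimensions each.
random.seed(0)
dim, length = 4, 6
states = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(length)]
w_g = [[random.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(dim)]

related = relation_layer(states, w_g)
print(len(related), len(related[0]))
```

In this sketch the first output vector changes whenever the last input state changes, which is the direct long-distance interaction that a recurrent or convolutional layer provides only through many intermediate steps. The cost is quadratic in sequence length, a trade-off that any pairwise scheme of this kind shares.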