A Hierarchical Neural Network for Sequence-to-Sequences Learning
This addresses a specific challenge in machine translation for long sentences, representing an incremental improvement.
The paper tackles the problem of lower translation quality on long sentences in Neural Machine Translation by proposing a hierarchical deep neural network architecture, achieving superior results with higher BLEU scores, lower perplexity, and better performance in expression style and word usage compared to traditional networks.
In recent years, the sequence-to-sequence learning neural networks with attention mechanism have achieved great progress. However, there are still challenges, especially for Neural Machine Translation (NMT), such as lower translation quality on long sentences. In this paper, we present a hierarchical deep neural network architecture to improve the quality of long sentences translation. The proposed network embeds sequence-to-sequence neural networks into a two-level category hierarchy by following the coarse-to-fine paradigm. Long sentences are input by splitting them into shorter sequences, which can be well processed by the coarse category network as the long distance dependencies for short sentences is able to be handled by network based on sequence-to-sequence neural network. Then they are concatenated and corrected by the fine category network. The experiments shows that our method can achieve superior results with higher BLEU(Bilingual Evaluation Understudy) scores, lower perplexity and better performance in imitating expression style and words usage than the traditional networks.