Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model
This addresses translation inconsistency for users of NMT systems, though it is incremental as it builds on existing encoder-decoder frameworks.
The paper tackles the problem of neural machine translation ignoring inter-sentence information, which can lead to ambiguous or inconsistent translations, by proposing an inter-sentence gate model that fuses recency from neighboring sentences, achieving substantial improvements on NIST Chinese-English translation tasks.
Neural machine translation (NMT) systems are usually trained on a large amount of bilingual sentence pairs and translate one sentence at a time, ignoring inter-sentence information. This may make the translation of a sentence ambiguous or even inconsistent with the translations of neighboring sentences. In order to handle this issue, we propose an inter-sentence gate model that uses the same encoder to encode two adjacent sentences and controls the amount of information flowing from the preceding sentence to the translation of the current sentence with an inter-sentence gate. In this way, our proposed model can capture the connection between sentences and fuse recency from neighboring sentences into neural machine translation. On several NIST Chinese-English translation tasks, our experiments demonstrate that the proposed inter-sentence gate model achieves substantial improvements over the baseline.