Improving Neural Machine Translation with Pre-trained Representation
This work aims to improve NMT translation quality by exploiting sentence-level representations, offering incremental improvements over existing methods.
The paper tackles the underuse of sentence-level contextual knowledge in neural machine translation. It proposes a novel structure that leverages monolingual data to acquire such representations and reports improved translation quality over strong Transformer baselines on Chinese-English and German-English tasks, with effectiveness also shown in the low-resource English-Turkish scenario.
Monolingual data has been shown to improve the translation quality of neural machine translation (NMT). Current methods, however, are limited to word-level knowledge, for example generating synthetic parallel data or extracting information from word embeddings. In contrast, sentence-level contextual knowledge, which is more complex and diverse and plays an important role in natural language generation, has not been fully exploited. In this paper, we propose a novel structure that leverages monolingual data to acquire sentence-level contextual representations. We then design a framework for integrating both source and target sentence-level representations into the NMT model to improve translation quality. Experimental results on Chinese-English and German-English machine translation tasks show that our proposed model improves over strong Transformer baselines, while experiments on English-Turkish further demonstrate the effectiveness of our approach in the low-resource scenario.