WeChat Neural Machine Translation Systems for WMT20
This work presents a high-performing system for Chinese-to-English news translation, but its contribution is incremental: it builds on existing methods such as the Transformer and DTMT.
The paper tackles Chinese-to-English news translation for WMT20. Its constrained system, based on Transformer variants and the DTMT architecture and enhanced by data selection, synthetic data generation, advanced finetuning, and model ensembling, achieves a case-sensitive BLEU score of 36.9, the highest among all submissions.
We participate in the WMT 2020 shared news translation task on Chinese-to-English. Our system is based on the Transformer (Vaswani et al., 2017a) with effective variants and the DTMT (Meng and Zhang, 2019) architecture. In our experiments, we employ data selection, several synthetic data generation approaches (i.e., back-translation, knowledge distillation, and iterative in-domain knowledge transfer), advanced finetuning approaches, and a Self-BLEU based model ensemble. Our constrained Chinese-to-English system achieves a case-sensitive BLEU score of 36.9, which is the highest among all submissions.
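The abstract names a Self-BLEU based model ensemble without describing the procedure. As a minimal sketch of one plausible reading, the Python below scores how similar two models' translations are by computing BLEU of one model's output against the other's, then greedily picks the models that overlap least with those already chosen. Everything here is an assumption made for illustration: the function names `corpus_bleu` and `select_diverse`, the simplified add-one-smoothed BLEU, the single-reference setup, and the greedy selection rule are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hyps, refs, max_n=4):
    """Simplified corpus BLEU: one reference per sentence, add-one smoothing.

    Illustrative only; not the official WMT scorer.
    """
    hyp_len = sum(len(h) for h in hyps)
    ref_len = sum(len(r) for r in refs)
    if hyp_len == 0:
        return 0.0
    log_precisions = 0.0
    for n in range(1, max_n + 1):
        matched, total = 0, 0
        for hyp, ref in zip(hyps, refs):
            hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
            # Counter intersection clips each n-gram count by the reference.
            matched += sum((hyp_counts & ref_counts).values())
            total += sum(hyp_counts.values())
        log_precisions += math.log((matched + 1) / (total + 1))
    brevity = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precisions / max_n)


def select_diverse(outputs, start, k):
    """Greedily pick k model names whose outputs overlap least (hypothetical rule).

    outputs maps a model name to its tokenized translations of a shared dev set.
    start is the model to seed the ensemble with, e.g. the best single model.
    """
    selected = [start]
    while len(selected) < k:
        candidates = [name for name in outputs if name not in selected]
        if not candidates:
            break

        def avg_overlap(name):
            # Mean BLEU of this candidate against every model chosen so far.
            scores = [corpus_bleu(outputs[name], outputs[s]) for s in selected]
            return sum(scores) / len(scores)

        selected.append(min(candidates, key=avg_overlap))
    return selected
```

Under this reading, a lower pairwise score means two models translate more differently, so the greedy step favors diversity among ensemble members. A real system would also weigh each candidate's translation quality on a development set, which this sketch leaves out.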