CL LG · Jun 30, 2019

The University of Sydney's Machine Translation System for WMT19

arXiv:1907.00494v1 · 1095 citations

Originality: Incremental advance

AI Analysis

This work addresses Finnish→English machine translation and achieves the top result in the WMT19 shared news translation task, but it is incremental because it builds on existing Transformer-based methods.

The paper tackles Finnish-to-English translation for the WMT19 shared task, achieving the best BLEU score (33.0) among participants by combining established strategies with two proposed methods: Cycle Translation and Big/Small parallel construction.

This paper describes the University of Sydney's submission to the WMT 2019 shared news translation task. We participated in the Finnish→English direction and obtained the best BLEU score (33.0) among all participants. Our system is based on self-attentional Transformer networks, into which we integrated recent effective strategies from academic research (e.g., BPE, back translation, multi-features data selection, data augmentation, greedy model ensemble, reranking, ConMBR system combination, and post-processing). Furthermore, we propose a novel augmentation method, Cycle Translation, and a data mixture strategy, Big/Small parallel construction, to fully exploit the synthetic corpus. Extensive experiments show that adding these techniques yields continuous improvements in BLEU, and the best result outperforms the baseline (a Transformer ensemble model trained on the original parallel corpus) by approximately 5.3 BLEU points, achieving state-of-the-art performance.
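The abstract names "greedy model ensemble" without describing the procedure. One common reading is forward greedy selection against a development set: start from an empty ensemble and keep adding the candidate that most improves the dev metric until nothing helps. The sketch below illustrates that reading only; it is an assumption, not the authors' method, and the names `greedy_ensemble` and `score` are hypothetical.

```python
from typing import Callable, List, Sequence


def greedy_ensemble(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Forward greedy selection of models for an ensemble.

    `score` is a hypothetical callback that returns a dev-set metric
    (for example BLEU) for a list of model names; higher is better.
    """
    selected: List[str] = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-model extension of the current ensemble.
        trials = [(score(selected + [m]), m) for m in remaining]
        top_score, top_model = max(trials)
        if top_score <= best:
            break  # no remaining candidate improves the dev score
        selected.append(top_model)
        remaining.remove(top_model)
        best = top_score
    return selected
```

In practice `score` would decode the development set with the given ensemble and compute BLEU, so each call is expensive; the greedy loop needs on the order of n² such calls for n candidates rather than scoring every subset.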
