The AMU-UEDIN Submission to the WMT16 News Translation Task: Attention-based NMT Models as Feature Functions in Phrase-based SMT
This work is aimed at machine translation practitioners: it shows that hybrid methods can outperform pure neural models for specific language pairs, although the contribution is incremental because it combines existing techniques.
The paper addresses decode-time integration of attention-based neural machine translation models with phrase-based statistical machine translation, and proposes efficient batch algorithms for GPU querying. For Russian-English, the system achieved the top BLEU result, outperforming the best pure neural system by 1.1 BLEU points and the authors' own phrase-based baseline by 1.6 BLEU; follow-up experiments added a further 0.8 BLEU.
This paper describes the AMU-UEDIN submissions to the WMT 2016 shared task on news translation. We explore methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation. Efficient batch algorithms for GPU querying are proposed and implemented. For English-Russian, our system remains behind the state-of-the-art pure neural models in terms of BLEU. Among restricted systems, manual evaluation places it in the first cluster, tied with the pure neural model. For the Russian-English task, our submission achieves the top BLEU result, outperforming the best pure neural system by 1.1 BLEU points and our own phrase-based baseline by 1.6 BLEU. After manual evaluation, this system is the best restricted system in its own cluster. In follow-up experiments we improve results by an additional 0.8 BLEU.
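To make the idea of an NMT model acting as a feature function concrete, the following is a minimal sketch and not the implementation submitted to the shared task. It assumes a log-linear phrase-based model in which the neural model contributes one extra weighted log-probability, and it gathers all word-level queries from a set of hypothesis expansions into a single batched call, which is the general motivation for batch querying on a GPU. All names here (`toy_batch_scorer`, `score_expansions`, the feature names `tm` and `lm`) are hypothetical, and the toy scorer is a stand-in for a real attention-based model.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A query asks: how likely is `word` given the target-side `prefix`?
Query = Tuple[Tuple[str, ...], str]
BatchScorer = Callable[[Sequence[Query]], List[float]]


def toy_batch_scorer(queries: Sequence[Query]) -> List[float]:
    """Hypothetical stand-in for an NMT model: returns one log-score per query.

    The formula is arbitrary (shorter words and shorter prefixes score higher);
    a real system would run the neural network once on the whole batch.
    """
    return [-0.1 * len(word) - 0.01 * len(prefix) for prefix, word in queries]


def score_expansions(
    expansions: List[dict],
    smt_weights: Dict[str, float],
    nmt_weight: float,
    batch_scorer: BatchScorer,
) -> List[float]:
    """Score hypothesis expansions with SMT features plus one NMT feature.

    Each expansion is a dict with:
      'prefix':   target words already produced by the hypothesis
      'phrase':   target words added by this expansion
      'features': SMT feature values (log domain), keyed by feature name
    """
    # 1. Collect every word-level query from every expansion.
    queries: List[Query] = []
    spans: List[Tuple[int, int]] = []
    for exp in expansions:
        start = len(queries)
        prefix = tuple(exp["prefix"])
        for word in exp["phrase"]:
            queries.append((prefix, word))
            prefix = prefix + (word,)
        spans.append((start, len(queries)))

    # 2. Issue a single batched call instead of one call per word.
    log_probs = batch_scorer(queries)

    # 3. Log-linear combination: weighted SMT features + weighted NMT score.
    totals = []
    for exp, (start, end) in zip(expansions, spans):
        smt_score = sum(smt_weights[name] * value
                        for name, value in exp["features"].items())
        nmt_score = sum(log_probs[start:end])
        totals.append(smt_score + nmt_weight * nmt_score)
    return totals
```

In this sketch the weight `nmt_weight` plays the same role as any other feature weight in the log-linear model, so it could in principle be tuned together with the SMT weights; how the submitted systems tune and batch their queries is not specified in the abstract.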