AI | Sep 23, 2024

Choose the Final Translation from NMT and LLM hypotheses Using MBR Decoding: HW-TSC's Submission to the WMT24 General MT Shared Task

Zhanglin Wu, Daimeng Wei, Zongyao Li, Hengchao Shang, Jiaxin Guo, Shaojun Li, Zhiqiang Rao, Yuanchang Luo, Ning Xie, Hao Yang

arXiv:2409.14800v1 | 22.022 citations | h-index: 11

Originality: Synthesis-oriented

AI Analysis

This work aims to improve translation quality on general machine translation tasks. Its contribution is incremental: it builds on existing methods such as MBR decoding and adds new model training strategies.

The paper uses Minimum Bayes Risk (MBR) decoding to select the best translation from hypotheses generated by both neural machine translation (NMT) models and large language model (LLM)-based MT models, and reports competitive results in the WMT24 English-to-Chinese shared task.

This paper presents the submission of Huawei Translate Services Center (HW-TSC) to the WMT24 general machine translation (MT) shared task, where we participate in the English to Chinese (en2zh) language pair. Similar to previous years' work, we use training strategies such as regularized dropout, bidirectional training, data diversification, forward translation, back translation, alternated training, curriculum learning, and transductive ensemble learning to train the neural machine translation (NMT) model based on the deep Transformer-big architecture. The difference is that we also use continued pre-training, supervised fine-tuning, and contrastive preference optimization to train the large language model (LLM) based MT model. By using Minimum Bayes Risk (MBR) decoding to select the final translation from multiple hypotheses produced by the NMT and LLM-based MT models, our submission receives competitive results in the final evaluation.
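To make the selection step concrete, the following is a minimal sketch of MBR decoding over a pooled list of hypotheses (for example, some from an NMT system and some from an LLM). The abstract does not say which utility metric the authors use, so the character-level F1 below is an assumption chosen only to keep the example dependency-free; the names `char_f1` and `mbr_select` are hypothetical. In practice a learned or standard MT metric would likely take its place.

```python
from collections import Counter


def char_f1(hyp: str, ref: str) -> float:
    """Character-level F1 overlap. A stand-in utility (assumption),
    not the metric used in the paper."""
    hyp_counts, ref_counts = Counter(hyp), Counter(ref)
    # Multiset intersection counts characters shared by both strings.
    overlap = sum((hyp_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(hypotheses: list[str], utility=char_f1) -> str:
    """Return the hypothesis with the highest average utility when every
    other hypothesis is treated as a pseudo-reference."""
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    if len(hypotheses) == 1:
        return hypotheses[0]

    def expected_utility(i: int) -> float:
        others = [h for j, h in enumerate(hypotheses) if j != i]
        return sum(utility(hypotheses[i], o) for o in others) / len(others)

    best = max(range(len(hypotheses)), key=expected_utility)
    return hypotheses[best]


if __name__ == "__main__":
    # Hypothetical pool: candidates from different systems for one source sentence.
    pool = ["今天天气很好", "今天天气不错", "今天天气很好啊", "我喜欢苹果"]
    print(mbr_select(pool))  # prints 今天天气很好
```

The selected hypothesis is the one that agrees most, on average, with the rest of the pool, so an outlier such as the last candidate is never chosen. The cost is quadratic in the number of hypotheses, which matters when the utility is an expensive neural metric.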
