CL · AI · Sep 24, 2024

Exploring the traditional NMT model and Large Language Model for chat translation

Jinlong Yang, Hengchao Shang, Daimeng Wei, Jiaxin Guo, Zongyao Li, Zhanglin Wu, Zhiqiang Rao, Shaojun Li, Yuhao Xie, Yuanchang Luo, Jiawei Zheng, Bin Wei

arXiv:2409.16331v1 · 12.622 citations · h-index: 26

Originality: Synthesis-oriented

AI Analysis

This work addresses chat translation for users needing real-time communication, but it is incremental as it applies existing methods to a specific shared task.

This paper tackles English-German chat translation by fine-tuning models on chat data and exploring strategies such as Minimum Bayes Risk (MBR) decoding and self-training. It reports significant performance improvements in certain directions, with the MBR self-training method yielding the best results.

This paper describes the submissions of Huawei Translation Services Center (HW-TSC) to the WMT24 chat translation shared task on English$\leftrightarrow$German (en-de) in both directions. The experiments involved fine-tuning models on chat data and exploring various strategies, including Minimum Bayes Risk (MBR) decoding and self-training. The results show significant performance improvements in certain directions, with the MBR self-training method achieving the best results. The paper also discusses the challenges and potential avenues for further research in the field of chat translation.
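For readers unfamiliar with MBR decoding, the idea can be sketched as follows: instead of taking the single most probable translation, the system generates several candidates and keeps the one that agrees most, on average, with the others. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The function names `token_f1` and `mbr_select` are hypothetical, and the unigram F1 overlap is a toy stand-in for whatever utility metric the submission actually used, which the abstract does not specify.

```python
from collections import Counter


def token_f1(hyp: str, ref: str) -> float:
    """Unigram F1 overlap between two strings (toy utility, an assumption)."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates, utility=token_f1):
    """Return the candidate with the highest average utility against the rest."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        # Treat every other candidate as a pseudo-reference.
        scores = [
            utility(candidates[i], other)
            for j, other in enumerate(candidates)
            if j != i
        ]
        return sum(scores) / len(scores)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]


# Hypothetical candidate translations for one chat turn.
candidates = [
    "thank you for your patience",
    "thanks for your patience",
    "thank you for waiting",
]
print(mbr_select(candidates))
```

Here the first candidate is selected because it shares the most tokens with both alternatives. In an MBR self-training setup, outputs selected this way would presumably serve as pseudo-targets for further fine-tuning, though the abstract does not describe the exact procedure.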
