CL · AI · Oct 15, 2024

PMMT: Preference Alignment in Multilingual Machine Translation via LLM Distillation

arXiv:2410.11410v1 · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the need for more human-aligned translations in cross-language communication, though the advance is incremental because it builds on existing LLM and distillation techniques.

The paper tackles the problem of aligning machine translations with human preferences like tone and style by proposing a method to generate multilingual parallel corpora with specific preferences using LLMs and distilling these preferences into smaller MT models. Experiments show the method leads in preference-aligned translation tasks and achieves competitive performance on standard benchmarks like WMT and Flores.

Translation is important for cross-language communication, and many efforts have been made to improve its accuracy. However, less effort has gone into aligning translations with human preferences, such as translation tone or style. In this paper, a new method is proposed to effectively generate large-scale multilingual parallel corpora with specific translation preferences using Large Language Models (LLMs). In addition, an automatic pipeline is designed to distill human preferences into smaller Machine Translation (MT) models that can efficiently and economically support large-scale calls in online services. Experiments indicate that the proposed method leads by a large margin on translation tasks with aligned human preferences. On popular public benchmarks such as WMT and Flores, on which the models were not trained, the proposed method is also competitive with SOTA work.

