CL · Jan 9

A Rising Tide Lifts All Boats: MTQE Rewards for Idioms Improve General Translation Quality

arXiv:2601.06307v1 · h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the problem of accurate cross-cultural and figurative language translation for machine translation systems, though it is incremental as it builds on existing fine-tuning methods.

The paper tackles the challenge of translating non-compositional expressions such as idioms in neural machine translation by fine-tuning models with MTQE rewards, reporting improvements of ~14 points for idiom translation, ~8 points for general translation, and ~6 points for cross-lingual translation.

Non-compositional expressions (e.g., idioms, proverbs, and metaphors) pose significant challenges for neural machine translation systems because their meanings cannot be derived from individual words alone. These expressions encode rich cultural meaning and have both figurative and literal readings, making accurate translation difficult. Because models are fairly good at translating compositional text, we investigate GRPO-style fine-tuning using Machine Translation Quality Estimation (MTQE) models as reward functions to train models to better translate idioms. Using Chinese and Hindi idiom datasets, we find that idiom translation abilities improve by ~14 points, general, non-idiomatic translation implicitly improves by ~8 points, and cross-lingual translation abilities (trained on one language, evaluated on another) improve by ~6 points. Overall, our work quantifies the non-compositional translation gap and offers insights for developing LLMs with stronger cross-cultural and figurative language understanding.
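
To make the "MTQE model as reward function" idea concrete, here is a minimal sketch of the group-relative advantage step commonly used in GRPO-style training: several candidate translations are sampled for one source sentence, each is scored, and the scores are normalized within the group. The abstract does not give the paper's exact setup, so everything here is an assumption for illustration: `toy_qe_score` and `TOY_SCORES` are hypothetical stand-ins for a real MTQE model, the scores are made-up values, and the population standard deviation is one of several possible normalization choices.

```python
from statistics import mean, pstdev
from typing import Callable, List

# Hypothetical, hand-picked scores standing in for an MTQE model's output.
# These are NOT results from the paper; they only illustrate the mechanics.
TOY_SCORES = {
    "draw a snake and add feet": 0.2,  # literal rendering
    "to gild the lily": 0.9,           # idiomatic equivalent
    "to overdo it": 0.5,               # plain paraphrase
}


def toy_qe_score(source: str, translation: str) -> float:
    """Hypothetical reference-free quality score in [0, 1]."""
    return TOY_SCORES.get(translation, 0.0)


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; sample std is also used
    return [(r - mu) / (sigma + eps) for r in rewards]


def score_group(
    source: str,
    candidates: List[str],
    qe_score: Callable[[str, str], float],
) -> List[float]:
    """Score every candidate for one source sentence, then normalize."""
    rewards = [qe_score(source, c) for c in candidates]
    return group_relative_advantages(rewards)


if __name__ == "__main__":
    source = "画蛇添足"
    candidates = list(TOY_SCORES)
    for cand, adv in zip(candidates, score_group(source, candidates, toy_qe_score)):
        print(f"{adv:+.3f}  {cand}")
```

Candidates scoring above the group mean receive a positive advantage and are reinforced, while those below it are pushed down. Because the scorer needs only the source and the candidate, no reference translation is required at training time.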

