Cross-Preference Learning for Sentence-Level and Context-Aware Machine Translation
This work addresses the fact that document context helps some sentences more than others in machine translation, and offers a training method that improves translation quality and robustness. The contribution is incremental: it builds on existing preference-based training and introduces no new architecture.
The paper tackles the problem that context-aware machine translation does not consistently outperform sentence-level translation. It proposes Cross-Preference Learning (CPL), a training framework that integrates intra- and cross-condition preferences to capture the complementary benefits of the two input conditions. The reported result is consistent improvement in translation quality and robustness across multiple models, such as Qwen3-4B and Llama-3-8B, without architectural changes.
Context-aware machine translation (MT) leverages document-level information, yet it does not consistently outperform sentence-level MT, as contextual signals are unevenly beneficial across sentences. Existing training objectives do not explicitly model this variability, limiting a model's ability to adaptively exploit context. In this paper, we propose Cross-Preference Learning (CPL), a preference-based training framework that explicitly captures the complementary benefits of sentence-level and context-aware MT. CPL achieves this by integrating both intra- and cross-condition preferences into the preference optimization objective. The introduction of intra- and cross-condition preferences provides explicit supervision on when and how contextual information improves translation quality. We validate the proposed approach on several public context-aware MT tasks using multiple models, including Qwen3-4B, Qwen3-8B, and Llama-3-8B. Experimental results demonstrate consistent improvements in translation quality and robustness across both input conditions, achieved without any architectural modifications.
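The abstract does not spell out the exact objective, so the following is only a minimal sketch of how intra- and cross-condition preference pairs might be formed and scored. It assumes a standard DPO-style loss and a scalar quality score per candidate translation. The names `Candidate`, `build_pairs`, and `dpo_loss`, the condition labels `"sent"` and `"ctx"`, and the example scores are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Candidate:
    condition: str  # "sent" (sentence-level input) or "ctx" (context-aware input)
    text: str       # candidate translation
    score: float    # hypothetical quality score, higher is better


def build_pairs(candidates):
    """Pair every two candidates of one source sentence.

    Assumption: a pair is 'intra' when both candidates come from the same
    input condition and 'cross' when they come from different conditions.
    Ties carry no preference signal and are skipped.
    """
    pairs = []
    for a, b in combinations(candidates, 2):
        if a.score == b.score:
            continue
        chosen, rejected = (a, b) if a.score > b.score else (b, a)
        kind = "intra" if a.condition == b.condition else "cross"
        pairs.append((chosen, rejected, kind))
    return pairs


def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one pair, given sequence log-probabilities.

    In a cross-condition pair, each log-probability would be computed under
    the input condition of its own candidate (an assumption of this sketch).
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)) written in a numerically stable form
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


candidates = [
    Candidate("sent", "translation A", 0.6),
    Candidate("sent", "translation B", 0.4),
    Candidate("ctx", "translation C", 0.8),
    Candidate("ctx", "translation D", 0.5),
]
pairs = build_pairs(candidates)
```

Under these assumptions, a training step would sum `dpo_loss` over both kinds of pairs. A cross pair in which the sentence-level candidate is chosen gives the model a signal that context did not help for that sentence, which is the kind of explicit supervision the abstract describes.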