CL · AI · Oct 14, 2024

QE-EBM: Using Quality Estimators as Energy Loss for Machine Translation

arXiv:2410.10228v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of exploiting gradient information from quality estimators in machine translation, with the largest gains on low-resource target languages.

The paper tackles the problem of machine translation alignment by proposing QE-EBM, which uses quality estimators as trainable loss networks to directly backpropagate gradients to neural machine translation models, achieving improvements such as 2.5 BLEU and 7.1 COMET-KIWI for English-to-Mongolian translation over supervised baselines.

Reinforcement learning has shown great promise in aligning language models with human preferences in a variety of text generation tasks, including machine translation. For translation tasks, rewards can easily be obtained from quality estimation (QE) models which can generate rewards for unlabeled data. Despite its usefulness, reinforcement learning cannot exploit the gradients with respect to the QE score. We propose QE-EBM, a method of employing quality estimators as trainable loss networks that can directly backpropagate to the NMT model. We examine our method on several low and high resource target languages with English as the source language. QE-EBM outperforms strong baselines such as REINFORCE and proximal policy optimization (PPO) as well as supervised fine-tuning for all target languages, especially low-resource target languages. Most notably, for English-to-Mongolian translation, our method achieves improvements of 2.5 BLEU, 7.1 COMET-KIWI, 5.3 COMET, and 6.4 XCOMET relative to the supervised baseline.
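
To illustrate the abstract's point that reinforcement learning does not use the gradient of the reward itself, here is a minimal toy sketch, not the paper's method. It assumes a one-dimensional Gaussian "policy" with mean theta in place of an NMT model and a quadratic function in place of a QE model; all names (`reward`, `reinforce_grad`, `pathwise_grad`, `compare`) are hypothetical. Real translations are sequences of discrete tokens, so how QE-EBM routes gradients to the NMT model is not shown here. The sketch only compares a REINFORCE-style score-function estimate with an estimate obtained by differentiating through the reward.

```python
import random
import statistics

def reward(x):
    # Toy differentiable "quality score" (an assumption): highest at x = 3.
    return -(x - 3.0) ** 2

def reward_grad(x):
    # Analytic derivative of the toy reward with respect to x.
    return -2.0 * (x - 3.0)

def reinforce_grad(theta, sigma, eps):
    # Score-function estimate: reward(x) * d/dtheta log N(x; theta, sigma^2).
    # Uses only the reward value, never its derivative.
    x = theta + sigma * eps
    return reward(x) * (x - theta) / sigma ** 2

def pathwise_grad(theta, sigma, eps):
    # Differentiate through the reward: d/dtheta reward(theta + sigma * eps).
    return reward_grad(theta + sigma * eps)

def compare(theta=0.0, sigma=1.0, n=20000, seed=0):
    # Evaluate both estimators on the same noise samples.
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    rf = [reinforce_grad(theta, sigma, e) for e in noise]
    pw = [pathwise_grad(theta, sigma, e) for e in noise]
    return {
        "reinforce_mean": statistics.fmean(rf),
        "reinforce_var": statistics.pvariance(rf),
        "pathwise_mean": statistics.fmean(pw),
        "pathwise_var": statistics.pvariance(pw),
    }

if __name__ == "__main__":
    # The exact gradient of the expected reward at theta = 0 is -2 * (0 - 3) = 6.
    for name, value in compare().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting both estimators target the same gradient, but the one that uses the reward's derivative has much lower per-sample variance. That gap is one common motivation for backpropagating through a differentiable scorer rather than treating it as a black-box reward.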

