CL AI
Jun 17, 2024

Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation

arXiv:2406.11632v5
2 citations
Originality: Incremental advance
AI Analysis

This paper is an incremental advance in neural machine translation decoding. It addresses the gap between estimated probability and translation quality: a hypothesis the model scores as highly probable is not always a good translation.

The paper proposes source-based Minimum Bayes Risk (sMBR) decoding, which uses quasi-sources as support hypotheses and a reference-free quality estimation metric as the utility function. The reported results show sMBR outperforming both QE reranking and standard MBR decoding.

Maximum a posteriori decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding offers an alternative by seeking hypotheses with the highest expected utility. Inspired by Quality Estimation (QE) reranking, which uses the QE model as a ranker, we propose source-based MBR (sMBR) decoding, a novel approach that utilizes quasi-sources (generated via paraphrasing or back-translation) as "support hypotheses" and a reference-free quality estimation metric as the utility function, marking the first work to solely use sources in MBR decoding. Experiments show that sMBR outperforms QE reranking and standard MBR decoding. Our findings suggest that sMBR is a promising approach for NMT decoding.
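The selection rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the tiny `LEXICON`, and both toy metrics are hypothetical stand-ins for learned metrics, and averaging uniformly over the supports is an assumption. The sketch only shows that standard MBR, QE reranking, and sMBR can be read as one rule that differs in what plays the role of the support set and the utility.

```python
from typing import Callable, Sequence

# Hypothetical toy lexicon (German to English) so the stand-in QE metric
# can compare a source sentence with a target hypothesis.
LEXICON = {"die": "the", "katze": "cat", "schläft": "sleeps", "schlummert": "sleeps"}


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def toy_reference_metric(reference: str, hypothesis: str) -> float:
    """Stand-in for a reference-based metric (same-language token overlap)."""
    return jaccard(set(reference.lower().split()), set(hypothesis.lower().split()))


def toy_qe_metric(source: str, hypothesis: str) -> float:
    """Stand-in for a reference-free QE metric: maps source tokens through
    the toy lexicon, then measures overlap with the hypothesis."""
    mapped = {LEXICON.get(tok, tok) for tok in source.lower().split()}
    return jaccard(mapped, set(hypothesis.lower().split()))


def mbr_select(
    candidates: Sequence[str],
    supports: Sequence[str],
    utility: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest mean utility over the supports."""
    if not candidates or not supports:
        raise ValueError("candidates and supports must be non-empty")

    def expected_utility(hypothesis: str) -> float:
        return sum(utility(s, hypothesis) for s in supports) / len(supports)

    return max(candidates, key=expected_utility)


candidates = ["the cat sleeps", "the dog sleeps", "a cat"]
source = "die katze schläft"
# Assumed to come from paraphrasing or back-translation; hard-coded here.
quasi_sources = ["die katze schläft", "die katze schlummert"]

# Standard MBR: the candidates themselves act as pseudo-references.
standard_choice = mbr_select(candidates, candidates, toy_reference_metric)
# QE reranking: a single support, the original source.
qe_choice = mbr_select(candidates, [source], toy_qe_metric)
# sMBR (as read from the abstract): quasi-sources as supports, QE as utility.
smbr_choice = mbr_select(candidates, quasi_sources, toy_qe_metric)
```

In this reading, QE reranking is the special case of a single support, and sMBR widens the support set with quasi-sources. Whether the original source is kept among the quasi-sources, and how they are weighted, is not stated in the abstract and is left as an assumption here.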

