Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms
This addresses a computational bottleneck for researchers and practitioners using MBR decoding in text generation, though it is an incremental improvement over existing approximation methods.
The paper tackled the high computational cost of Minimum Bayes Risk (MBR) decoding in machine translation by approximating it as a low-rank matrix completion problem, reducing utility metric computations to 1/16 of the original while maintaining equal translation quality on WMT22 datasets.
Minimum Bayes Risk (MBR) decoding is a powerful decoding strategy widely used for text generation tasks, but its quadratic computational complexity limits its practical application. This paper presents a novel approach for approximating MBR decoding using matrix completion techniques, focusing on the task of machine translation. We formulate MBR decoding as a matrix completion problem, where the utility metric scores between candidate hypotheses and pseudo-reference translations form a low-rank matrix. First, we empirically show that the scores matrices indeed have a low-rank structure. Then, we exploit this by only computing a random subset of the scores and efficiently recover the missing entries in the matrix by applying the Alternating Least Squares (ALS) algorithm, thereby enabling a fast approximation of the MBR decoding process. Our experimental results on machine translation tasks demonstrate that the proposed method requires 1/16 utility metric computations compared to vanilla MBR decoding while achieving equal translation quality measured by COMET22 on the WMT22 dataset (en<>de and en<>ru). We also benchmark our method against other approximation methods and we show gains in quality when comparing to them.