CL · Aug 8, 2024

mbrs: A Library for Minimum Bayes Risk Decoding

Hiroyuki Deguchi, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe

arXiv:2408.04167v2 · 15.427 citations · h-index: 15 · Has Code

Originality Synthesis-oriented

AI Analysis

This is an incremental contribution providing a tool for researchers and developers in natural language processing to enhance text generation tasks.

The authors introduce mbrs, a library for Minimum Bayes Risk (MBR) decoding, a decision rule for text generation that outperforms conventional maximum a posteriori (MAP) decoding. The library lets users flexibly combine metrics, expectation estimations, and algorithmic variants, and is designed around speed measurement, transparency, reproducibility, and extensibility.

Minimum Bayes risk (MBR) decoding is a decision rule for text generation tasks that outperforms conventional maximum a posteriori (MAP) decoding using beam search by selecting high-quality outputs based on a utility function rather than those with high probability. Typically, it finds the most suitable hypothesis from the set of hypotheses under the sampled pseudo-references. mbrs is a library for MBR decoding that can flexibly combine various metrics, alternative expectation estimations, and algorithmic variants. It is designed with a focus on speed measurement and call counts of code blocks, transparency, reproducibility, and extensibility, which are essential for researchers and developers. We published mbrs as an MIT-licensed open-source project, and the code is available on GitHub: https://github.com/naist-nlp/mbrs
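
The decision rule described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the mbrs API: the function names `unigram_f1` and `mbr_decode` are hypothetical, and the toy unigram-overlap utility merely stands in for a proper evaluation metric. The sketch assumes the common setup in which the sampled hypotheses also serve as the pseudo-references, and it estimates each hypothesis's expected utility as a simple average.

```python
from collections import Counter


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy utility (an assumption): unigram-overlap F1 on whitespace tokens."""
    hyp_tokens = hypothesis.split()
    ref_tokens = reference.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_decode(hypotheses, pseudo_references, utility=unigram_f1):
    """Return the hypothesis with the highest average utility, plus all scores."""
    expected_utilities = [
        sum(utility(h, r) for r in pseudo_references) / len(pseudo_references)
        for h in hypotheses
    ]
    best_index = max(range(len(hypotheses)), key=expected_utilities.__getitem__)
    return hypotheses[best_index], expected_utilities


# Samples double as pseudo-references, as is typical for MBR decoding.
samples = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is on the mat",
    "dogs bark loudly",
]
best, scores = mbr_decode(samples, samples)
print(best)  # the cat sat on the mat
```

The selected output is the one that agrees most with the other samples on average, not the one the model assigns the highest probability. The outlier "dogs bark loudly" scores lowest because it shares no tokens with any other sample.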
