CL | AI | May 29, 2025

Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport

arXiv:2505.23078v1 | 1 citation | h-index: 13 | Has Code | ACL
Originality: Incremental advance
AI Analysis

This paper addresses the problem of generating high-quality documents for NLP applications, but the advance is incremental: it adapts an existing method to a more complex setting.

The paper tackles the challenge of applying Minimum Bayes Risk (MBR) decoding to document-level text generation. It proposes MBR-OT, which uses the Wasserstein distance to compute a document-level utility from a sentence-level utility function, and reports improved performance over standard MBR on machine translation, text simplification, and dense image captioning.

Document-level text generation tasks are known to be more difficult than sentence-level text generation tasks, as they require understanding a longer context to generate high-quality text. In this paper, we investigate the adaptation of Minimum Bayes Risk (MBR) decoding to document-level text generation tasks. MBR decoding uses a utility function to select, from a set of candidate outputs, the output with the highest estimated expected utility. Although MBR decoding has been shown to be effective in a wide range of sentence-level text generation tasks, its performance on document-level text generation tasks is limited, because many utility functions are designed to evaluate sentences. To this end, we propose MBR-OT, a variant of MBR decoding that uses the Wasserstein distance to compute the utility of a document from a sentence-level utility function. Experimental results show that MBR-OT outperforms standard MBR in document-level machine translation, text simplification, and dense image captioning tasks. Our code is available at https://github.com/jinnaiyuu/mbr-optimal-transport
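The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes a toy unigram-F1 overlap as the sentence-level utility, uniform weights over sentences, and an entropic-regularized (Sinkhorn) approximation of the optimal transport plan. The names `sentence_utility`, `sinkhorn_plan`, `document_utility`, and `mbr_decode` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def sentence_utility(a: str, b: str) -> float:
    """Toy sentence-level utility: unigram F1 in [0, 1] (a stand-in for a real metric)."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    overlap = sum((ta & tb).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(ta.values())
    recall = overlap / sum(tb.values())
    return 2 * precision * recall / (precision + recall)


def sinkhorn_plan(cost, a, b, eps=0.05, iters=200):
    """Approximate OT plan between weight vectors a and b via Sinkhorn iterations."""
    n, m = len(a), len(b)
    # Gibbs kernel; strictly positive because every cost entry is finite.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def document_utility(hyp, ref):
    """Document utility from sentence utilities, weighted by a transport plan.

    hyp and ref are non-empty lists of sentences. Assumption: the transport
    cost is 1 - utility and every sentence carries equal weight.
    """
    util = [[sentence_utility(h, r) for r in ref] for h in hyp]
    cost = [[1.0 - x for x in row] for row in util]
    a = [1.0 / len(hyp)] * len(hyp)
    b = [1.0 / len(ref)] * len(ref)
    plan = sinkhorn_plan(cost, a, b)
    return sum(plan[i][j] * util[i][j]
               for i in range(len(hyp)) for j in range(len(ref)))


def mbr_decode(candidates, utility):
    """Pick the candidate with the highest mean utility against the other samples.

    Requires at least two candidates; the samples double as pseudo-references.
    """
    def expected(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected)
    return candidates[best]


if __name__ == "__main__":
    docs = [
        ["the cat sat on the mat", "it was sunny"],
        ["it was sunny", "the cat sat on the mat"],
        ["dogs bark loudly"],
    ]
    print(mbr_decode(docs, document_utility))
```

Because the transport plan aligns sentences regardless of their position, two documents that contain the same sentences in a different order still receive a high utility, and documents with different sentence counts can be compared without padding. In practice the toy overlap metric would be replaced by a learned sentence-level metric, and an exact OT solver could replace the Sinkhorn approximation.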

Code Implementations: 1 repo
