LG · CR · Apr 13

Reducing Hallucination in Enterprise AI Workflows via Hybrid Utility Minimum Bayes Risk (HUMBR)

arXiv:2604.11141 · 77.7 · h-index: 4
Predicted impact top 17% in LG · last 90 days · Originality: Incremental advance
AI Analysis

For organizations like Meta deploying LLMs in legal, risk, and compliance workflows, this method dramatically reduces hallucination risk, addressing a critical bottleneck in enterprise AI.

The paper tackles hallucination in high-stakes enterprise LLM workflows by framing mitigation as a Minimum Bayes Risk (MBR) problem. The proposed HUMBR framework reduces critical recall failures, and 81% of the pipeline's suggestions were preferred over human-crafted ground truth.

Although LLMs drive automation, high-stakes enterprise workflows such as those involving legal matters, risk management, and privacy compliance demand particular care. For Meta, and for other organizations like ours, a single hallucinated clause in such high-stakes workflows risks material consequences. We show that by framing hallucination mitigation as a Minimum Bayes Risk (MBR) problem, we can dramatically reduce this risk. Specifically, we introduce a Hybrid Utility MBR (HUMBR) framework that combines semantic embedding similarity with lexical precision to identify consensus without ground-truth references, for which we derive rigorous error bounds. We complement this theoretical analysis with a comprehensive empirical evaluation on widely used public benchmark suites (TruthfulQA and LegalBench) and on real-world data from a Meta production deployment. The results of our empirical study show that MBR significantly outperforms standard Universal Self-Consistency. Notably, 81% of the pipeline's suggestions were preferred over human-crafted ground truth, and critical recall failures were virtually eliminated.
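The abstract does not give the HUMBR utility in closed form, so the following is only a minimal sketch of reference-free MBR selection under stated assumptions. Each sampled output is scored by its mean utility against the other samples, which act as pseudo-references, and the highest-scoring sample is returned. The function names (`hybrid_utility`, `mbr_select`), the mixing weight `alpha`, and the linear combination are hypothetical choices for illustration. A bag-of-words cosine stands in for a real semantic embedding model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (illustrative only)."""
    return text.lower().split()


def cosine_bow(a, b):
    """Toy stand-in for semantic embedding similarity: cosine over token counts.

    ASSUMPTION: a real system would use a sentence-embedding model here.
    """
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexical_precision(candidate, reference):
    """Fraction of candidate tokens that also occur in the pseudo-reference."""
    cand = tokenize(candidate)
    if not cand:
        return 0.0
    ref = set(tokenize(reference))
    return sum(1 for t in cand if t in ref) / len(cand)


def hybrid_utility(candidate, reference, alpha=0.5):
    """ASSUMPTION: a linear mix of the two signals; alpha is a hypothetical weight."""
    return (alpha * cosine_bow(candidate, reference)
            + (1 - alpha) * lexical_precision(candidate, reference))


def mbr_select(samples, alpha=0.5):
    """Return (best_index, scores), where each score is the candidate's mean
    utility against all other samples (no ground-truth reference needed)."""
    if len(samples) < 2:
        raise ValueError("MBR selection needs at least two samples")
    scores = []
    for i, cand in enumerate(samples):
        others = [s for j, s in enumerate(samples) if j != i]
        scores.append(sum(hybrid_utility(cand, o, alpha) for o in others) / len(others))
    best = max(range(len(samples)), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Three hypothetical samples; the third adds an unsupported clause.
    samples = [
        "the notice period is 30 days",
        "the notice period is 30 days",
        "the notice period is 90 days and includes a penalty fee",
    ]
    best, scores = mbr_select(samples)
    print(best, [round(s, 3) for s in scores])
```

In this sketch the precision term penalizes a candidate for tokens that the other samples do not support, which is one plausible reason to pair a lexical signal with embedding similarity when a single extra clause matters.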

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
