AIFLDec 5, 2025

BEAVER: An Efficient Deterministic LLM Verifier

arXiv:2512.05439v12 citations
Originality Highly original
AI Analysis

This addresses the need for reliable verification in production LLM systems, offering a practical solution with provable guarantees, though it is incremental as it builds on existing verification concepts.

The paper tackles the problem of verifying that large language model outputs satisfy constraints by introducing BEAVER, a framework that computes deterministic, sound probability bounds, achieving 6 to 8 times tighter bounds and identifying 3 to 4 times more high-risk instances compared to baselines.

As large language models (LLMs) transition from research prototypes to production systems, practitioners often need reliable methods to verify that model outputs satisfy required constraints. While sampling-based estimates provide an intuition of model behavior, they offer no sound guarantees. We present BEAVER, the first practical framework for computing deterministic, sound probability bounds on LLM constraint satisfaction. Given any prefix-closed semantic constraint, BEAVER systematically explores the generation space using novel token trie and frontier data structures, maintaining provably sound bounds at every iteration. We formalize the verification problem, prove soundness of our approach, and evaluate BEAVER on correctness verification, privacy verification and secure code generation tasks across multiple state of the art LLMs. BEAVER achieves 6 to 8 times tighter probability bounds and identifies 3 to 4 times more high risk instances compared to baseline methods under identical computational budgets, enabling precise characterization and risk assessment that loose bounds or empirical evaluation cannot provide.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes