CL · AI · LG · May 8

MAVEN: Multi-Agent Verification-Elaboration Network with In-Step Epistemic Auditing

arXiv:2605.07646
AI Analysis

For high-stakes applications requiring epistemic trust, MAVEN provides a model-agnostic reasoning booster that enhances interpretability and accuracy through structured multi-agent deliberation.

MAVEN introduces a multi-agent framework built around an adversarial Skeptic-Researcher-Judge loop, which improves LLM reasoning by making deliberation explicit, modular, and verifiable. On OpenBookQA, TruthfulQA, HALUEVAL, and StrategyQA, it outperforms latent reasoning models such as GEMINI-3.1-Pro and consensus-based baselines such as ReConcile across four fine-grained reasoning-quality metrics.

While explicit reasoning trajectories enhance model interpretability, existing paradigms often rely on monolithic chains that lack intermediate verification, allowing early errors to cascade unchecked. This lack of modularity impedes granular auditing and compromises the epistemic trust required for high-stakes applications. We propose MAVEN (Multi-Agent Verification-Elaboration Network with In-Step Epistemic Auditing), a blackboard-inspired framework designed to transform LLMs into deliberate reasoners through explicit role-decoupling. At its core, MAVEN operationalizes an adversarial Skeptic-Researcher-Judge loop, simulating expert deliberation by functionally separating logical defense from factual grounding. Experiments on OpenBookQA, TruthfulQA, HALUEVAL and StrategyQA benchmarks demonstrate that MAVEN delivers superior reasoning quality across four fine-grained metrics. Notably, MAVEN consistently outperforms latent reasoning models such as GEMINI-3.1-Pro and consensus-based baselines (e.g., ReConcile) by generating explicitly structured, modular, and verifiable deliberation trajectories, rather than relying on implicit internal states or post-hoc consensus. Moreover, comprehensive evaluations confirm that MAVEN is fully model-agnostic, serving as a strong and transferable reasoning booster that yields substantial performance improvements across diverse backbone models.
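
As a rough illustration of the role-decoupled loop described in the abstract, the sketch below posts each reasoning step to a shared blackboard and audits it before the next step is allowed. It is a minimal sketch under stated assumptions, not MAVEN's actual implementation: the abstract does not specify interfaces, so every name here (Blackboard, audit_step, deliberate, the toy agents) is hypothetical, and the agents are plain functions standing in for LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: an agent reads a prompt plus the shared blackboard
# and returns text. In practice each agent would wrap an LLM call.
Agent = Callable[[str, "Blackboard"], str]


@dataclass
class Entry:
    role: str
    text: str


@dataclass
class Blackboard:
    """Shared, append-only record of the deliberation (assumed structure)."""
    question: str
    entries: List[Entry] = field(default_factory=list)

    def post(self, role: str, text: str) -> None:
        self.entries.append(Entry(role, text))


def audit_step(claim: str, board: Blackboard,
               skeptic: Agent, researcher: Agent, judge: Agent) -> bool:
    """Audit one reasoning step: challenge it, ground it, then rule on it."""
    board.post("claim", claim)
    objection = skeptic(claim, board)        # logical challenge
    board.post("skeptic", objection)
    evidence = researcher(objection, board)  # factual grounding
    board.post("researcher", evidence)
    verdict = judge(claim, board)            # ruling based on the record
    board.post("judge", verdict)
    return verdict.strip().lower().startswith("accept")


def deliberate(question: str, steps: List[str],
               skeptic: Agent, researcher: Agent, judge: Agent) -> Blackboard:
    """Audit steps in order; stop at the first rejection so it cannot cascade."""
    board = Blackboard(question)
    for claim in steps:
        if not audit_step(claim, board, skeptic, researcher, judge):
            break
    return board


# Toy stand-ins for the three roles (illustrative only).
def toy_skeptic(claim: str, board: Blackboard) -> str:
    return f"What supports: {claim}?"


def toy_researcher(objection: str, board: Blackboard) -> str:
    return "no evidence found" if "moon is cheese" in objection else "evidence found"


def toy_judge(claim: str, board: Blackboard) -> str:
    # The most recent entry is the Researcher's evidence for this claim.
    return "accept" if board.entries[-1].text == "evidence found" else "reject"


board = deliberate(
    "toy question",
    ["water is wet", "the moon is cheese", "never reached"],
    toy_skeptic, toy_researcher, toy_judge,
)
for entry in board.entries:
    print(f"{entry.role}: {entry.text}")
```

Because every objection, piece of evidence, and verdict is stored as a separate blackboard entry, each step can be inspected on its own, which is the kind of granular auditing the abstract contrasts with monolithic reasoning chains. Halting at the first rejected step is a design choice of this sketch; the paper may instead revise or retry a rejected step.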

