AI

Aug 6, 2025

Deliberative Reasoning Network: An Uncertainty-Driven Paradigm for Belief-Tracked Inference with Pretrained Language Models

Anran Xu, Jincheng Wang, Baigen Cai, Tao Wen

arXiv:2508.04339v1

h-index: 6

Originality: Highly original

AI Analysis

This work addresses a fundamental limitation in AI reasoning and aims at more trustworthy systems, though it appears incremental in that it builds on existing LLMs.

The paper tackles logical reasoning failures that occur in large language models when semantic heuristics conflict with evidence. It introduces the Deliberative Reasoning Network (DRN), which reframes reasoning as uncertainty minimization. The bespoke DRN achieves up to 15.2% improvement over standard baselines on the new LCR-1000 benchmark, and a DRN-based verifier raises Mistral-7B accuracy from 20% to 80% on the most challenging problems.

Large language models often fail at logical reasoning when semantic heuristics conflict with decisive evidence, a phenomenon we term cognitive traps. To address this fundamental limitation, we introduce the Deliberative Reasoning Network (DRN), a novel paradigm that reframes logical reasoning from probability maximization to uncertainty minimization. Instead of asking "Which answer is most likely?", DRN asks "Which hypothesis has the most internally consistent evidence?" DRN achieves intrinsic interpretability by explicitly tracking belief states and quantifying epistemic uncertainty for competing hypotheses through an iterative evidence synthesis process. We validate our approach through two complementary architectures: a bespoke discriminative model that embodies the core uncertainty minimization principle, and a lightweight verification module that enhances existing generative LLMs. Evaluated on LCR-1000, our new adversarial reasoning benchmark designed to expose cognitive traps, the bespoke DRN achieves up to 15.2% improvement over standard baselines. When integrated as a parameter-efficient verifier with Mistral-7B, our hybrid system boosts accuracy from 20% to 80% on the most challenging problems. Critically, DRN demonstrates strong zero-shot generalization, improving TruthfulQA performance by 23.6% without additional training, indicating that uncertainty-driven deliberation learns transferable reasoning principles. We position DRN as a foundational, verifiable System 2 reasoning component for building more trustworthy AI systems.
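
The contrast between probability maximization and uncertainty minimization can be illustrated with a toy sketch. This is not the authors' architecture: the abstract does not specify how DRN computes belief states or uncertainty. The sketch assumes each hypothesis receives a list of per-evidence support scores in [0, 1], takes the running mean as the belief, and uses the population variance as a stand-in for epistemic uncertainty. The function names, the prior values, and the scores are all hypothetical.

```python
from statistics import mean, pvariance

def track_beliefs(evidence_scores):
    """Fold in evidence one item at a time, recording the
    (belief, uncertainty) state after each step.
    Assumption: belief = running mean, uncertainty = population variance."""
    seen, states = [], []
    for score in evidence_scores:
        seen.append(score)
        states.append((mean(seen), pvariance(seen)))
    return states

def pick_most_likely(priors):
    """Probability maximization: choose the hypothesis with the highest
    heuristic prior, ignoring how consistent its evidence is."""
    return max(priors, key=priors.get)

def pick_least_uncertain(evidence):
    """Uncertainty minimization: choose the hypothesis whose final belief
    state has the lowest uncertainty (most internally consistent evidence)."""
    final = {h: track_beliefs(scores)[-1] for h, scores in evidence.items()}
    return min(final, key=lambda h: final[h][1])

# Hypothetical "cognitive trap": A looks plausible on the surface,
# but its evidence is contradictory; B has a lower prior but steady support.
priors = {"A": 0.7, "B": 0.3}
evidence = {
    "A": [0.9, 0.2, 0.8, 0.1],
    "B": [0.7, 0.8, 0.75, 0.7],
}

print(pick_most_likely(priors))        # A
print(pick_least_uncertain(evidence))  # B
```

The list returned by `track_beliefs` is what makes the toy version inspectable: each step shows how the belief and its uncertainty moved as evidence arrived. A variance-only criterion would also favor a hypothesis that is consistently unsupported, so a realistic selector would need to weigh the belief level as well.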
