AI · May 23

Understanding and Mitigating Premature Confidence for Better LLM Reasoning

Jingchu Gai, Guanning Zeng, Christina Baek, Chen Wu, J. Zico Kolter, Andrej Risteski, Aditi Raghunathan

arXiv:2605.24396 · 42.7

Predicted impact: top 3% in AI · last 90 days · Originality: Highly original

AI Analysis

For researchers and practitioners building LLM reasoning systems, this work provides a scalable, label-free method to improve reasoning quality and faithfulness by addressing a previously underexplored failure mode.

The paper identifies premature confidence (early commitment to an answer) as a key cause of flawed reasoning in LLMs and proposes progressive confidence shaping, a reinforcement learning objective that trains models to update confidence gradually. The method improves accuracy by up to 42 percentage points on Countdown and reduces flawed reasoning by 48 percentage points, with consistent gains across model sizes and tasks.

Long chains of thought (CoT) from current language models frequently contain logical gaps and unjustified leaps, limiting the gains from additional test-time compute. Improving reasoning quality directly would require process reward models, but the step-level annotations needed to train them are expensive and scarce. We find an annotation-free signal of reasoning quality in how the model's confidence evolves during reasoning: premature confidence, the tendency to commit to an answer early and use the remaining tokens to rationalize it, strongly predicts flawed reasoning across tasks and model scales. We exploit this in progressive confidence shaping, a reinforcement learning objective that trains models to update their confidence as they reason rather than commit early: it rewards gradual confidence growth and penalizes early commitment, with no external labels or reward models. The method improves accuracy and reasoning quality from 1.5B to 8B parameters across arithmetic (Countdown), math (DAPO, AIME), and science (ScienceQA): on Countdown, accuracy improves 3.2x (+42.0pp) and flawed reasoning drops 48pp; on AIME, Pass@64 improves 6.6pp. Consistent with this mechanism, the method also improves faithfulness: on a safety benchmark, our models more transparently surface misleading content in their reasoning traces rather than concealing it. Controlled experiments reveal that the problem and its remedy scale together: premature confidence grows with model size and task difficulty, and so do the gains from addressing it.
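The abstract describes the objective only in words (reward gradual confidence growth, penalize early commitment) and gives no formula. The sketch below is one hypothetical way to turn a per-step confidence trajectory into a scalar shaping term. It is not the paper's objective. The function names, the 25% "early" window, the 0.8 threshold, the per-step cap, and the weights `alpha` and `beta` are all assumptions chosen for illustration. It also assumes some way of reading off, at each reasoning step, the model's confidence in its eventual final answer.

```python
from typing import Sequence


def early_commitment_penalty(conf: Sequence[float],
                             early_frac: float = 0.25,
                             threshold: float = 0.8) -> float:
    """Mean amount by which confidence exceeds `threshold` in the early steps.

    `early_frac` and `threshold` are illustrative assumptions, not paper values.
    """
    n_early = max(1, int(len(conf) * early_frac))
    early = conf[:n_early]
    return sum(max(0.0, c - threshold) for c in early) / n_early


def gradual_growth_bonus(conf: Sequence[float], max_step: float = 0.2) -> float:
    """Sum of step-to-step confidence increases, each clipped to `max_step`.

    Clipping means many small increases earn full credit, while one sudden
    jump earns at most `max_step`.
    """
    bonus = 0.0
    for prev, cur in zip(conf, conf[1:]):
        bonus += min(max(cur - prev, 0.0), max_step)
    return bonus


def shaping_reward(conf: Sequence[float],
                   alpha: float = 1.0,
                   beta: float = 1.0) -> float:
    """Hypothetical shaping term: growth bonus minus early-commitment penalty."""
    if not conf:
        raise ValueError("confidence trajectory must be non-empty")
    return alpha * gradual_growth_bonus(conf) - beta * early_commitment_penalty(conf)


# Two made-up trajectories: confidence in the final answer at each step.
gradual = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
premature = [0.95, 0.95, 0.96, 0.96, 0.97, 0.97, 0.98, 0.98, 0.99]

if __name__ == "__main__":
    print("gradual  :", shaping_reward(gradual))
    print("premature:", shaping_reward(premature))
```

Under these assumed settings, the trajectory that builds confidence step by step scores higher than the one that starts near certainty. In an actual reinforcement learning setup, a term like this would presumably be combined with an outcome reward for answer correctness; how the paper combines them is not stated in the abstract.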
