AI · Dec 21, 2025

Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction

arXiv:2512.18605v1 · 3.3 · h-index: 12

Originality: Incremental advance

AI Analysis

This work addresses the computational overhead of ensemble-based LLM reasoning, a concern for both researchers and practitioners, and offers an incremental improvement over early-stopping methods.

The paper tackles the inefficiency of ensemble-based reasoning in large language models by proposing reflective confidence, a framework that uses low-confidence signals to trigger self-correction instead of discarding incomplete paths. The authors report significant accuracy improvements over early-stopping baselines on mathematical reasoning benchmarks, including AIME 2025, at comparable computational cost.

Large language models (LLMs) have achieved strong performance on complex reasoning tasks using techniques such as chain-of-thought and self-consistency. However, ensemble-based approaches, especially self-consistency, which relies on multiple reasoning trajectories, often incur substantial computational overhead. To improve efficiency, prior work has leveraged internal confidence signals: early-stopping strategies such as DeepConf reduce cost by terminating low-confidence trajectories. However, this strategy discards incomplete reasoning paths and wastes the computation already spent on them. We propose reflective confidence, a novel reasoning framework that transforms low-confidence signals from termination indicators into reflection triggers. When confidence falls below a threshold, instead of stopping generation, the model produces a reflection prompt to analyze the current reasoning state, identify potential errors, and continue generation along a corrected trajectory. Experiments on mathematical reasoning benchmarks, including AIME 2025, demonstrate significant accuracy improvements over advanced early-stopping baselines at comparable computational cost, validating the effectiveness of proactive self-correction over passive discarding.
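The contrast between early stopping and reflection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `StepFn` interface, the wording of `REFLECTION_PROMPT`, the threshold value, the reflection budget, and the toy model are all assumptions made for illustration, and a real system would derive confidence from the model's token probabilities.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical interface (assumption): given the text so far, return
# (next_chunk, confidence in [0, 1], done_flag).
StepFn = Callable[[str], Tuple[str, float, bool]]

# Assumed wording; the abstract does not give the actual reflection prompt.
REFLECTION_PROMPT = "\nWait, let me re-check the reasoning above for errors.\n"


def generate_with_early_stop(step: StepFn, prompt: str,
                             threshold: float = 0.5,
                             max_steps: int = 32) -> Optional[str]:
    """Baseline sketch: abandon the trajectory once confidence drops."""
    text = prompt
    for _ in range(max_steps):
        chunk, conf, done = step(text)
        if conf < threshold:
            return None  # partial reasoning is discarded
        text += chunk
        if done:
            break
    return text


def generate_with_reflection(step: StepFn, prompt: str,
                             threshold: float = 0.5,
                             max_steps: int = 32,
                             max_reflections: int = 2) -> str:
    """Reflection sketch: low confidence triggers a reflection prompt."""
    text = prompt
    reflections = 0
    for _ in range(max_steps):
        chunk, conf, done = step(text)
        text += chunk  # keep the partial reasoning so it can be examined
        if done:
            break
        if conf < threshold and reflections < max_reflections:
            text += REFLECTION_PROMPT  # continue instead of terminating
            reflections += 1
    return text


def majority_vote(answers: List[str]) -> str:
    """Self-consistency style aggregation over final answers."""
    return Counter(answers).most_common(1)[0][0]


def toy_step(text: str) -> Tuple[str, float, bool]:
    """Toy stand-in for a model: errs first, recovers after reflection."""
    if REFLECTION_PROMPT in text:
        return ("Corrected: 6 * 7 = 42. Answer: 42", 0.9, True)
    return ("6 * 7 = 48.", 0.2, False)
```

With the toy model, `generate_with_early_stop(toy_step, "Q: 6*7? ")` drops the trajectory and returns `None`, while `generate_with_reflection` keeps the partial reasoning, appends the reflection prompt, and continues to a corrected answer. The reflection budget (`max_reflections`) is one way to keep cost bounded; how the paper controls cost is not stated in the abstract.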
