CL Nov 18, 2025

Don't Miss the Forest for the Trees: In-Depth Confidence Estimation for LLMs via Reasoning over the Answer Space

arXiv:2511.14275v1
2 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for reliable confidence estimation in LLM applications, offering an incremental improvement over existing verbalized confidence methods.

The paper tackles the problem of improving confidence estimation in large language models by requiring them to predict a verbalized probability distribution over the answer space, which encourages deeper reasoning. This method shows advantages across different models and tasks, maintaining effectiveness even after reinforcement learning.

Knowing the reliability of a model's response is essential in practical applications. Given the strong generation capabilities of LLMs, research has focused on generating verbalized confidence. This is further enhanced by combining it with chain-of-thought reasoning, which provides a logical and transparent estimation. However, how reasoning strategies affect the estimated confidence is still under-explored. In this work, we demonstrate that predicting a verbalized probability distribution can effectively encourage in-depth reasoning for confidence estimation. Intuitively, it requires an LLM to consider all candidates within the answer space instead of relying on a single guess, and to carefully assign confidence scores so that they satisfy the requirements of a distribution. This method shows an advantage across different models and various tasks, regardless of whether the answer space is known. Its advantage is maintained even after reinforcement learning, and further analysis shows that its reasoning patterns are aligned with human expectations.
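To make the idea of a verbalized probability distribution concrete, here is a minimal sketch of how such an output might be consumed downstream. It is not the paper's implementation. The output format (one `candidate: probability` pair per line), the function names `parse_verbalized_distribution` and `answer_and_confidence`, and the renormalization step are all assumptions made for illustration.

```python
import re

# Assumed (hypothetical) output format: one "candidate: probability" per line,
# where the probability is either a fraction ("0.6") or a percentage ("60%").
_LINE = re.compile(r"^\s*(.+?)\s*:\s*([0-9]*\.?[0-9]+)\s*(%?)\s*$")


def parse_verbalized_distribution(text: str) -> dict[str, float]:
    """Parse a verbalized distribution and renormalize it to sum to 1."""
    scores: dict[str, float] = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # skip reasoning text or malformed lines
        candidate, number, percent = match.groups()
        value = float(number)
        if percent:
            value /= 100.0
        # Merge duplicate candidates by summing their stated mass.
        scores[candidate] = scores.get(candidate, 0.0) + value

    total = sum(scores.values())
    if total <= 0:
        raise ValueError("no positive probabilities found")
    # Assumption: if the stated scores do not sum to 1, rescale them.
    return {candidate: value / total for candidate, value in scores.items()}


def answer_and_confidence(dist: dict[str, float]) -> tuple[str, float]:
    """Return the most probable candidate and its probability as the confidence."""
    answer = max(dist, key=dist.get)
    return answer, dist[answer]


if __name__ == "__main__":
    response = "Paris: 60%\nLyon: 30%\nMarseille: 30%"
    dist = parse_verbalized_distribution(response)
    print(answer_and_confidence(dist))
```

Taking the top candidate's probability as the confidence is one simple design choice. The point of the sketch is only that the confidence comes from mass assigned across the whole answer space rather than from a single guess.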

