CL | Oct 13, 2025

ADVICE: Answer-Dependent Verbalized Confidence Estimation

arXiv:2510.10913v1 | 3 citations | h-index: 22
Originality: Incremental advance
AI Analysis

This paper addresses unreliable confidence estimates in LLMs, a problem for users who need transparency. The advance is incremental because it builds on existing verbalization methods.

The paper tackles overconfidence in large language models' verbalized confidence. It identifies answer-independence as a key factor and proposes ADVICE, a fine-tuning framework that improves confidence calibration while preserving task performance.

Recent progress in large language models (LLMs) has enabled them to express their confidence in natural language, enhancing transparency and reliability. However, their confidence often exhibits overconfidence, the cause of which remains poorly understood. In this work, we conduct a detailed analysis of the dynamics underlying verbalized confidence and identify answer-independence as a key factor, defined as the model's failure to condition confidence on its own answer. To address this, we propose ADVICE (Answer-Dependent Verbalized Confidence Estimation), a fine-tuning framework that facilitates answer-grounded confidence estimation. Extensive experiments show that ADVICE substantially improves confidence calibration while preserving task performance. Further analyses confirm that ADVICE strengthens answer-groundedness, leading to more balanced and well-calibrated confidence distributions. Our findings shed light on the origin of overconfidence and establish a framework for more trustworthy confidence verbalization.
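To make "confidence calibration" concrete, the sketch below computes expected calibration error (ECE), a commonly used calibration measure. This is an illustration only: the text above does not say which metrics the paper reports, and the function name, the bin count, and the toy data are assumptions made for the example. ECE groups predictions by stated confidence and averages the gap between mean confidence and accuracy in each group, so an overconfident model (high stated confidence, low accuracy) gets a large score.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: verbalized confidences in [0, 1] (hypothetical input format)
    correct:     booleans, True where the model's answer was right
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Assign each prediction to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / n) * abs(mean_conf - accuracy)
    return ece


# Toy data (made up): the model always says 0.9 but is right 1 time in 4.
overconfident = expected_calibration_error([0.9] * 4, [True, False, False, False])

# Toy data (made up): the model says 0.5 and is right half the time.
calibrated = expected_calibration_error([0.5, 0.5], [True, False])
```

In the first toy case the stated confidence far exceeds the accuracy, so the score is large; in the second they match, so the score is zero. A model whose confidence does not depend on its own answer would tend to look like the first case whenever it answers incorrectly.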
