CL · Feb 17, 2025

Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception

Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi, Xueqi Cheng

arXiv:2502.11677v223.424 citations · h-index: 13 · ACL

Originality: Incremental advance

AI Analysis

This work targets the reliability of LLMs in practical applications by improving their ability to recognize gaps in their own knowledge. The advance is incremental, since it builds on existing confidence-estimation methods.

This paper tackles the problem of large language models (LLMs) struggling to accurately gauge their knowledge boundaries, which leads to confident but incorrect responses. The results show that LLMs can already perceive their knowledge boundaries to a significant degree from internal states before generating a response, and that the proposed Confidence Consistency-based Calibration ($C^3$) improves the unknown perception rate by 5.6% on Natural Questions (NQ) and 4.9% on HotpotQA.

Large language models (LLMs) exhibit impressive performance across diverse tasks but often struggle to accurately gauge their knowledge boundaries, leading to confident yet incorrect responses. This paper explores leveraging LLMs' internal states to enhance their perception of knowledge boundaries from efficiency and risk perspectives. We investigate whether LLMs can estimate their confidence using internal states before response generation, potentially saving computational resources. Our experiments on datasets like Natural Questions, HotpotQA, and MMLU reveal that LLMs demonstrate significant pre-generation perception, which is further refined post-generation, with perception gaps remaining stable across varying conditions. To mitigate risks in critical domains, we introduce Confidence Consistency-based Calibration ($C^3$), which assesses confidence consistency through question reformulation. $C^3$ significantly improves LLMs' ability to recognize their knowledge gaps, enhancing the unknown perception rate by 5.6% on NQ and 4.9% on HotpotQA. Our findings suggest that pre-generation confidence estimation can optimize efficiency, while $C^3$ effectively controls output risks, advancing the reliability of LLMs in practical applications.
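The abstract describes $C^3$ only at a high level: confidence is checked for consistency across reformulations of the same question. The following is a minimal sketch of that general idea, not the authors' implementation. The function name `consistent_confidence`, the `threshold` and `min_agreement` parameters, and the toy confidence scores are all assumptions made for illustration; in practice `estimate_confidence` would be whatever estimator (for example, a probe on internal states) produces a score for a question.

```python
from typing import Callable, Sequence


def consistent_confidence(
    question: str,
    reformulations: Sequence[str],
    estimate_confidence: Callable[[str], float],
    threshold: float = 0.5,       # assumed cutoff for "confident"
    min_agreement: float = 1.0,   # assumed fraction of reformulations that must agree
) -> bool:
    """Return True only if the model appears confident on the original
    question and on enough of its reformulations. A False result means
    the question should be treated as outside the model's knowledge."""

    def is_confident(q: str) -> bool:
        return estimate_confidence(q) >= threshold

    if not is_confident(question):
        return False
    if not reformulations:
        return True
    agreeing = sum(is_confident(r) for r in reformulations)
    return agreeing / len(reformulations) >= min_agreement


# Toy confidence scores standing in for a real estimator (hypothetical values).
toy_scores = {
    "Who wrote Hamlet?": 0.9,
    "Hamlet was written by whom?": 0.85,
    "Who is the author of Hamlet?": 0.3,
}
rephrasings = ["Hamlet was written by whom?", "Who is the author of Hamlet?"]

strict = consistent_confidence("Who wrote Hamlet?", rephrasings, toy_scores.__getitem__)
lenient = consistent_confidence(
    "Who wrote Hamlet?", rephrasings, toy_scores.__getitem__, min_agreement=0.5
)
```

With the toy scores above, the strict setting rejects the question because one rephrasing falls below the threshold, while the lenient setting accepts it. Requiring full agreement by default reflects the risk-control motivation stated in the abstract: a question is answered only when confidence survives rewording.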
