Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
This addresses the problem of unreliable uncertainty expression in LLMs for users relying on their outputs, though it is incremental in nature.
The study investigated whether large language models (LLMs) can express uncertainty when faced with questions outside their parametric knowledge, finding that most LLMs fail to consistently refuse or express uncertainty, with only marginal improvements from fine-tuning and alignment techniques.
Can large language models (LLMs) express their uncertainty in situations where they lack sufficient parametric knowledge to generate reasonable responses? This work aims to systematically investigate LLMs' behaviors in such situations, emphasizing the trade-off between honesty and helpfulness. To tackle the challenge of precisely determining LLMs' knowledge gaps, we diagnostically create unanswerable questions containing non-existent concepts or false premises, ensuring that they are outside the LLMs' vast training data. By compiling a benchmark, UnknownBench, which consists of both unanswerable and answerable questions, we quantitatively evaluate the LLMs' performance in maintaining honesty while being helpful. Using a model-agnostic unified confidence elicitation approach, we observe that most LLMs fail to consistently refuse or express uncertainty towards questions outside their parametric knowledge, although instruction fine-tuning and alignment techniques can provide marginal enhancements. Moreover, LLMs' uncertainty expression does not always stay consistent with the perceived confidence of their textual outputs.