CL | May 29, 2025

UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions

Chuanyuan Tan, Wenbiao Shao, Hao Xiong, Tong Zhu, Zhenhua Liu, Kai Shi, Wenliang Chen

arXiv:2505.23461v16.72 citations | h-index: 13 | Has Code | ACL

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of LLMs giving misleading responses to unanswerable questions. The contribution is incremental: it builds on existing datasets by adding factual knowledge support.

The authors evaluate how LLMs use factual knowledge when handling unanswerable questions by introducing UAQFact, a new bilingual dataset derived from a Knowledge Graph. They find that LLMs do not perform consistently well even with access to factual knowledge, and that performance varies across models.

Handling unanswerable questions (UAQ) is crucial for LLMs, as it helps prevent misleading responses in complex situations. While previous studies have built several datasets to assess LLMs' performance on UAQ, these datasets lack factual knowledge support, which limits the evaluation of LLMs' ability to utilize their factual knowledge when handling UAQ. To address this limitation, we introduce a new unanswerable question dataset, UAQFact, a bilingual dataset with auxiliary factual knowledge created from a Knowledge Graph. Based on UAQFact, we further define two new tasks to measure LLMs' ability to utilize internal and external factual knowledge, respectively. Our experimental results across multiple LLM series show that UAQFact presents significant challenges, as LLMs do not consistently perform well even when they have factual knowledge stored. Additionally, we find that incorporating external knowledge may enhance performance, but LLMs still cannot make full use of the knowledge, which may result in incorrect responses.
