CL · May 24

Knowing but Not Showing: LLMs Recognize Ambiguity but Rarely Ask Clarifying Questions

arXiv:2605.25284 · 22.2
Predicted impact: top 56% in CL · last 90 days · Originality: Synthesis-oriented
AI Analysis

For developers of conversational AI, this reveals a critical behavioral gap between ambiguity recognition and clarification-seeking, highlighting a limitation in current models' helpfulness.

LLMs recognize ambiguity when explicitly asked but rarely ask clarifying questions in standard QA, defaulting to direct answers even for ambiguous queries; retrieved context widens this gap by improving answerability.

User queries are often underspecified and may admit multiple valid interpretations. Rather than silently making assumptions about the user's intent, a helpful assistant should surface such ambiguity by asking a clarifying question. Doing so requires two abilities: recognizing that a query is ambiguous, and acting on that recognition by seeking clarification instead of answering directly. To study these abilities, we evaluate models on ambiguous, unambiguous, and disambiguated questions in three settings: standard question answering, explicit ambiguity judgment, and behavioral analysis, where a judge model classifies responses as direct answers, refusals, or clarifying questions. We find a clear gap between recognition and behavior: models often identify ambiguity when explicitly asked to judge it, yet in the QA setting they overwhelmingly default to direct answers. Retrieved context further widens this gap by improving answerability while making models even less likely to ask clarifying questions.
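The behavioral analysis described above can be illustrated with a small tallying sketch. It is not the authors' code: the function names, the label strings, and the "gap" definition (share of ambiguous queries recognized as ambiguous minus share that drew a clarifying question) are all assumptions made here for illustration, and it presumes the judge model's outputs have already been collected as plain labels.

```python
from collections import Counter

# The three response classes named in the abstract; the exact strings are hypothetical.
LABELS = ("direct_answer", "refusal", "clarifying_question")


def label_rates(judged):
    """Return {question_type: {label: rate}} from (question_type, label) pairs.

    `judged` is assumed to hold one judge-model label per model response.
    """
    counts = {}
    for qtype, label in judged:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        counts.setdefault(qtype, Counter())[label] += 1

    rates = {}
    for qtype, counter in counts.items():
        total = sum(counter.values())
        rates[qtype] = {lab: counter[lab] / total for lab in LABELS}
    return rates


def recognition_behavior_gap(recognized, qa_labels):
    """Illustrative gap metric for ambiguous queries only (an assumption, not the paper's).

    recognized: list of bools, True if the model called the query ambiguous
                in the explicit-judgment setting.
    qa_labels:  list of judge labels for the same queries in the QA setting.
    """
    if not recognized or not qa_labels:
        raise ValueError("both inputs must be non-empty")
    recognition_rate = sum(recognized) / len(recognized)
    clarification_rate = sum(
        lab == "clarifying_question" for lab in qa_labels
    ) / len(qa_labels)
    return recognition_rate - clarification_rate
```

A large positive gap under this definition would correspond to the pattern the abstract reports: the model flags ambiguity when asked, yet answers directly in QA. Computing `label_rates` separately for runs with and without retrieved context would allow the same comparison across the two conditions.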
