CL | Feb 20, 2025

ALFA: Aligning LLMs to Ask Good Questions A Case Study in Clinical Reasoning

Shuyue Stella Li, Jimin Mun, Faeze Brahman, Pedram Hosseini, Bryceton G. Thomas, Jessica M. Sin, Bing Ren, Jonathan S. Ilgen, Yulia Tsvetkov, Maarten Sap

AI2, CMU

arXiv:2502.14860v222.222 citations | h-index: 49 | Has Code

Originality: Highly original

AI Analysis

This work addresses unreliable LLM question-asking during proactive information-gathering, particularly in clinical reasoning, and reports incremental improvements from fine-grained attribute alignment.

The paper tackles the failure of large language models (LLMs) to ask effective questions under uncertainty, a capability critical for decision-making in domains such as clinical reasoning. It presents ALFA, a framework that reduces diagnostic errors by 56.6% on MediQ-AskDocs compared to state-of-the-art instruction-tuned LLMs.

Large language models (LLMs) often fail to ask effective questions under uncertainty, making them unreliable in domains where proactive information-gathering is essential for decision-making. We present ALignment via Fine-grained Attributes (ALFA), a framework that improves LLM question-asking by (i) decomposing the notion of a "good" question into a set of theory-grounded attributes (e.g., clarity, relevance), (ii) controllably synthesizing attribute-specific question variations, and (iii) aligning models via preference-based optimization to explicitly learn to ask better questions along these fine-grained attributes. Focusing on clinical reasoning as a case study, we introduce the MediQ-AskDocs dataset, composed of 17k real-world clinical interactions augmented with 80k attribute-specific preference pairs of follow-up questions, as well as a novel expert-annotated interactive healthcare QA task to evaluate question-asking abilities. Models aligned with ALFA reduce diagnostic errors by 56.6% on MediQ-AskDocs compared to SoTA instruction-tuned LLMs, with a question-level win-rate of 64.4% and strong generalizability. Our findings suggest that explicitly guiding question-asking with structured, fine-grained attributes offers a scalable path to improve LLMs, especially in expert application domains.
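
The three steps in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the `PreferencePair` record, the `build_pairs` helper, and the example questions are hypothetical, and the abstract says only "preference-based optimization", so the DPO-style loss shown here is one common objective of that kind rather than a confirmed detail of ALFA.

```python
import math
from dataclasses import dataclass

# Only clarity and relevance are named in the abstract; the full attribute
# set is not given there, so this tuple is an illustrative assumption.
ATTRIBUTES = ("clarity", "relevance")


@dataclass(frozen=True)
class PreferencePair:
    """Hypothetical record for one attribute-specific preference pair."""
    context: str    # the clinical interaction so far
    attribute: str  # the attribute along which the two questions differ
    chosen: str     # variation that is better on this attribute
    rejected: str   # variation that is worse on this attribute


def build_pairs(context, variations):
    """Turn {attribute: (better, worse)} variations into preference pairs."""
    pairs = []
    for attribute, (better, worse) in variations.items():
        if attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute}")
        pairs.append(PreferencePair(context, attribute, better, worse))
    return pairs


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO-style loss for one pair: -log(sigmoid(margin)).

    Assumption: the abstract does not name the objective; DPO is used here
    as a representative preference-based loss. Inputs are sequence
    log-probabilities under the trained policy and a frozen reference model.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable softplus(-margin), equal to -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical usage with made-up questions and log-probabilities.
pairs = build_pairs(
    "Patient reports a headache lasting three days.",
    {"clarity": ("When did the headache start?",
                 "Can you tell me about the timing and things?")},
)
loss = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

The loss falls below its value at zero margin whenever the policy favors the chosen question more strongly than the reference model does, which is the direction in which preference-based training pushes the model on each attribute.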
