What Patients Really Ask: Exploring the Effect of False Assumptions in Patient Information Seeking
This work addresses a critical gap in healthcare AI by focusing on real-world patient queries, though the contribution is incremental: it highlights a known bottleneck without proposing a new solution.
The study addresses the poor performance of large language models (LLMs) on real patient questions, which often contain incorrect assumptions. The authors curate a dataset from Google's People Also Ask feature for top prescribed medications and show that current LLMs struggle to identify these incorrect assumptions.
Patients are increasingly using large language models (LLMs) to seek answers to their healthcare-related questions. However, benchmarking efforts in LLMs for question answering often focus on medical exam questions, which differ significantly in style and content from the questions patients actually raise in real life. To bridge this gap, we sourced data from Google's People Also Ask feature by querying the top 200 prescribed medications in the United States, curating a dataset of medical questions people commonly ask. A considerable portion of the collected questions contains incorrect assumptions and dangerous intentions. We demonstrate that the emergence of these corrupted questions is not uniformly random and depends heavily on the degree of incorrectness in the history of questions that led to their appearance. Current LLMs that perform strongly on other benchmarks struggle to identify incorrect assumptions in everyday questions.
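The claim that corrupted questions do not appear uniformly at random can be made concrete with a small sketch. The code below is a hypothetical illustration, not the authors' method: it assumes each chain of People Also Ask questions is already available as an ordered list of boolean labels (True meaning the question contains an incorrect assumption), and it estimates how often the next question is corrupted as a function of the fraction of corrupted questions preceding it. The function name, the label format, and the bucketing to one decimal place are all assumptions made for this example.

```python
from collections import defaultdict


def corruption_rate_by_history(chains):
    """Estimate P(next question is corrupted | fraction of corrupted history).

    chains: list of lists of bools (hypothetical format). Each inner list is
    one path of questions in order of appearance; True marks a question that
    contains an incorrect assumption.
    Returns a dict mapping the history fraction (rounded to one decimal)
    to the observed rate of corrupted follow-up questions.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [corrupted, total]
    for chain in chains:
        # Start at index 1 so every question has a non-empty history.
        for i in range(1, len(chain)):
            history = chain[:i]
            bucket = round(sum(history) / len(history), 1)
            counts[bucket][0] += int(chain[i])
            counts[bucket][1] += 1
    return {b: c / n for b, (c, n) in sorted(counts.items())}


# Toy labels, invented purely for illustration.
toy_chains = [
    [False, False, False],
    [True, True, False],
    [True, True, True],
    [False, True],
]
rates = corruption_rate_by_history(toy_chains)
```

If corrupted questions appeared uniformly at random, the rates would be roughly equal across buckets. In this toy data, the rate following a fully incorrect history is higher than the rate following a fully correct one, which is the kind of dependence the abstract describes.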