Language models are susceptible to incorrect patient self-diagnosis in medical applications
This work highlights a critical safety issue for deploying LLMs in healthcare by exposing risks in real patient-doctor interactions, though its contribution is incremental in that it focuses on a single, specific vulnerability.
The study examines how vulnerable large language models (LLMs) are to incorrect patient self-diagnosis in medical applications, and finds that diagnostic accuracy drops dramatically when patients supply incorrect bias-validating information.
Large language models (LLMs) are becoming increasingly relevant as a potential tool for healthcare, aiding communication between clinicians, researchers, and patients. However, traditional evaluations of LLMs on medical exam questions do not reflect the complexity of real patient-doctor interactions. An example of this complexity is the introduction of patient self-diagnosis, where a patient attempts to diagnose their own medical conditions from various sources. While the patient sometimes arrives at an accurate conclusion, they are more often led toward misdiagnosis because they over-emphasize bias-validating information. In this work, we present a variety of LLMs with multiple-choice questions from United States medical board exams that are modified to include self-diagnostic reports from patients. Our findings highlight that when a patient proposes incorrect bias-validating information, the diagnostic accuracy of LLMs drops dramatically, revealing a high susceptibility to errors in self-diagnosis.
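The evaluation setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `Question` class, the `build_prompt` and `accuracy` helpers, the wording of the injected patient statement, and the placeholder item are all assumptions introduced here, and `always_a` is a stand-in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Question:
    """One multiple-choice item (hypothetical structure)."""
    stem: str                 # clinical vignette text
    options: Dict[str, str]   # option letter -> answer text
    answer: str               # letter of the correct option


def build_prompt(q: Question, self_diagnosis: Optional[str] = None) -> str:
    """Render a question, optionally injecting a patient self-diagnosis.

    `self_diagnosis` is the letter of the option the patient proposes.
    The sentence template is an assumption, not the paper's wording.
    """
    lines = [q.stem]
    if self_diagnosis is not None:
        lines.append(
            f"The patient is convinced they have {q.options[self_diagnosis]}."
        )
    lines += [f"{letter}. {text}" for letter, text in sorted(q.options.items())]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def accuracy(
    model: Callable[[str], str],
    questions: List[Question],
    self_diagnoses: List[Optional[str]],
) -> float:
    """Fraction of questions where the model's first letter matches the key."""
    correct = 0
    for q, proposed in zip(questions, self_diagnoses):
        reply = model(build_prompt(q, proposed))
        correct += reply.strip().upper()[:1] == q.answer
    return correct / len(questions)


# Placeholder item; it carries no real medical content.
toy = Question(
    stem="A patient presents with placeholder symptoms.",
    options={"A": "condition A", "B": "condition B"},
    answer="A",
)


def always_a(prompt: str) -> str:
    """Stand-in for an LLM call; it ignores the prompt entirely."""
    return "A"


baseline = accuracy(always_a, [toy], [None])   # unmodified question
biased = accuracy(always_a, [toy], ["B"])      # patient proposes a wrong option
```

Comparing `baseline` with `biased` for a real model, over a full question set, would give the kind of accuracy gap the abstract reports. The stub here ignores the prompt, so it only exercises the harness and says nothing about model behavior.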