CL AI HC

Aug 13, 2023

Diagnostic Reasoning Prompts Reveal the Potential for Large Language Model Interpretability in Medicine

Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen

arXiv:2308.06834v1 17.4268 citations h-index: 30

Originality: Incremental advance

AI Analysis

This work addresses the trust barrier that keeps physicians from using LLMs in patient care by making the models' decision-making more interpretable.

The study tackles the uninterpretability of LLMs in medicine by developing diagnostic reasoning prompts that test whether the models can mimic clinical reasoning. The authors find that GPT-4 can do so without losing diagnostic accuracy.

One of the major barriers to using large language models (LLMs) in medicine is the perception they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript we develop novel diagnostic reasoning prompts to study whether LLMs can perform clinical reasoning to accurately form a diagnosis. We find that GPT4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can use clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether LLMs can be trusted for patient care. Novel prompting methods have the potential to expose the black box of LLMs, bringing them one step closer to safe and effective use in medicine.
