AI AP Nov 20, 2023

ChatGPT and post-test probability

arXiv:2311.12188v5
h-index: 5
Originality: Synthesis-oriented
AI Analysis

This work addresses the reliability of AI in healthcare diagnostics and highlights limitations in formal reasoning tasks. It is incremental in that it builds on existing critiques of LLM sensitivity and specificity.

The study evaluated ChatGPT's ability to perform probabilistic medical diagnostic reasoning, such as updating a pre-test probability to a post-test probability with Bayes' rule. It found that introducing medical terminology increased error rates, though prompt engineering could partially mitigate these errors.

Reinforcement learning-based large language models, such as ChatGPT, are believed to have potential to aid human experts in many domains, including healthcare. There is, however, little work on ChatGPT's ability to perform a key task in healthcare: formal, probabilistic medical diagnostic reasoning. This type of reasoning is used, for example, to update a pre-test probability to a post-test probability. In this work, we probe ChatGPT's ability to perform this task. In particular, we ask ChatGPT to give examples of how to use Bayes' rule for medical diagnosis. Our prompts range from queries that use terminology from pure probability (e.g., requests for a posterior of A given B and C) to queries that use terminology from medical diagnosis (e.g., requests for a posterior probability of Covid given a test result and cough). We show how the introduction of medical variable names leads to an increase in the number of errors that ChatGPT makes. Given our results, we also show how one can use prompt engineering to facilitate ChatGPT's partial avoidance of these errors. We discuss our results in light of recent commentaries on sensitivity and specificity. We also discuss how our results might inform new research directions for large language models.
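
For readers unfamiliar with the task being probed, the following is a minimal sketch of a pre-test to post-test update with Bayes' rule for a single binary test. It is not taken from the paper: the function name `post_test_probability` and the numbers used (10% pre-test probability, 90% sensitivity, 95% specificity) are hypothetical and chosen only for illustration.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive=True):
    """Return P(disease | test result) via Bayes' rule for one binary test.

    pre_test    -- prior (pre-test) probability of disease
    sensitivity -- P(test positive | disease)
    specificity -- P(test negative | no disease)
    positive    -- True for a positive result, False for a negative one
    """
    for name, value in (("pre_test", pre_test),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    if positive:
        p_result_given_disease = sensitivity
        p_result_given_healthy = 1.0 - specificity  # false positive rate
    else:
        p_result_given_disease = 1.0 - sensitivity  # false negative rate
        p_result_given_healthy = specificity

    # Numerator: P(result | disease) * P(disease)
    numerator = p_result_given_disease * pre_test
    # Denominator: total probability of observing this result
    evidence = numerator + p_result_given_healthy * (1.0 - pre_test)
    if evidence == 0.0:
        raise ValueError("the observed result has probability zero under these inputs")
    return numerator / evidence


# Hypothetical illustrative numbers, not results from the paper.
print(f"{post_test_probability(0.10, 0.90, 0.95, positive=True):.3f}")   # 0.667
print(f"{post_test_probability(0.10, 0.90, 0.95, positive=False):.3f}")  # 0.012
```

A query with two findings, such as a test result and a cough, would need either their joint likelihood or an explicit assumption such as conditional independence given disease status; the sketch above deliberately covers only the single-test case.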

