Causal-Discovery Performance of ChatGPT in the context of Neuropathic Pain Diagnosis
This work addresses the problem of evaluating AI models for causal reasoning in medical applications, but it is incremental as it applies an existing method to new data.
The study tests ChatGPT's ability to answer causal discovery questions using a medical benchmark for neuropathic pain diagnosis (Tu et al. 2019). It is motivated by the proficiency ChatGPT has already shown in natural language conversation.
ChatGPT has demonstrated exceptional proficiency in natural language conversation; for example, it can answer a wide range of questions that previous large language models could not. We therefore probe its limits by exploring its ability to answer causal discovery questions, using a medical benchmark for causal discovery (Tu et al. 2019).
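To make the evaluation idea concrete, the following is a minimal sketch of how yes/no causal questions could be scored against a ground-truth causal graph. It is not the authors' protocol: the question wording, the function names (`make_question`, `parse_yes_no`, `score_answers`), the `ask` callable standing in for a chat model, and the variable names `A`, `B`, `C` are all hypothetical placeholders, assumed only for illustration.

```python
import re
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def make_question(cause: str, effect: str) -> str:
    # Hypothetical prompt wording; the source does not specify the phrasing.
    return f"Does {cause} cause {effect}? Answer yes or no."


def parse_yes_no(answer: str) -> Optional[bool]:
    # Accept only answers that begin with the whole word "yes" or "no".
    match = re.match(r"\s*(yes|no)\b", answer, re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"


def score_answers(
    pairs: Iterable[Edge],
    true_edges: Set[Edge],
    ask: Callable[[str], str],
) -> Dict[str, int]:
    # Compare each parsed answer with whether the pair is a ground-truth edge.
    counts = {"correct": 0, "wrong": 0, "unparsed": 0}
    for cause, effect in pairs:
        verdict = parse_yes_no(ask(make_question(cause, effect)))
        if verdict is None:
            counts["unparsed"] += 1
        elif verdict == ((cause, effect) in true_edges):
            counts["correct"] += 1
        else:
            counts["wrong"] += 1
    return counts
```

In use, `ask` would wrap whatever interface queries the model, and `true_edges` would hold the directed edges of the benchmark's ground-truth graph. Counting unparseable answers separately avoids silently treating an evasive reply as a "no".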