AI CL HC | Feb 24, 2025

Improving Interactive Diagnostic Ability of a Large Language Model Agent Through Clinical Experience Learning

Zhoujian Sun, Ziyi Liu, Cheng Luo, Jiebin Chu, Zhengxing Huang

arXiv:2503.16463v1 | 9.63 citations | h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses a critical bottleneck in developing autonomous diagnostic systems for healthcare, though the advance is incremental: it builds on existing LLM methods by adding specialized training.

The study addresses the underperformance of large language models (LLMs) in interactive medical diagnosis, which it attributes to inefficient initial information gathering. It develops a plug-and-play method enhanced (PPME) LLM agent that improves diagnostic accuracy by over 30% compared to baselines, approaching the accuracy achieved with complete clinical data.

Recent advances in large language models (LLMs) have shown promising results in medical diagnosis, with some studies indicating superior performance compared to human physicians in specific scenarios. However, the diagnostic capabilities of LLMs are often overestimated, as their performance significantly deteriorates in interactive diagnostic settings that require active information gathering. This study investigates the underlying mechanisms behind the performance degradation phenomenon and proposes a solution. We identified that the primary deficiency of LLMs lies in the initial diagnosis phase, particularly in information-gathering efficiency and initial diagnosis formation, rather than in the subsequent differential diagnosis phase. To address this limitation, we developed a plug-and-play method enhanced (PPME) LLM agent, leveraging over 3.5 million electronic medical records from Chinese and American healthcare facilities. Our approach integrates specialized models for initial disease diagnosis and inquiry into the history of the present illness, trained through supervised and reinforcement learning techniques. The experimental results indicate that the PPME LLM achieved over 30% improvement compared to baselines. The final diagnostic accuracy of the PPME LLM in interactive diagnostic scenarios approached levels comparable to those achieved using complete clinical data. These findings suggest a promising potential for developing autonomous diagnostic systems, although further validation studies are needed.
