Evaluation of AI Chatbots for Patient-Specific EHR Questions
This work addresses the problem of efficiently retrieving information from electronic health records (EHRs) for healthcare professionals, but the contribution is incremental because it applies existing methods to a new domain.
This paper investigates the use of AI chatbots such as ChatGPT, Google Bard, and Claude for answering patient-specific questions from clinical notes, and evaluates the accuracy, relevance, comprehensiveness, and coherence of their answers on a Likert scale.
This paper investigates the use of artificial intelligence chatbots for patient-specific question answering (QA) from clinical notes using several large language model (LLM)-based systems: ChatGPT (versions 3.5 and 4), Google Bard, and Claude. We evaluate the accuracy, relevance, comprehensiveness, and coherence of the answers generated by each model using a 5-point Likert scale on a set of patient-specific questions.
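The evaluation described above, rating each answer on four criteria with a 5-point Likert scale, could be summarized per model roughly as follows. This is a minimal sketch, not the authors' analysis: the function name `mean_likert_scores`, the `(model, criterion, score)` tuple layout, the placeholder model names, and the use of a simple mean are all assumptions made for illustration, and the ratings shown are made-up values, not results from the paper.

```python
from collections import defaultdict
from statistics import mean

# The four criteria named in the abstract.
CRITERIA = ("accuracy", "relevance", "comprehensiveness", "coherence")


def mean_likert_scores(ratings):
    """Average 5-point Likert ratings per model and criterion.

    `ratings` is an iterable of (model, criterion, score) tuples, where
    score is an integer from 1 to 5. This layout is hypothetical.
    Returns a nested dict: {model: {criterion: mean score}}.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for model, criterion, score in ratings:
        if criterion not in CRITERIA:
            raise ValueError(f"unknown criterion: {criterion!r}")
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
        buckets[model][criterion].append(score)
    return {
        model: {criterion: mean(scores) for criterion, scores in by_criterion.items()}
        for model, by_criterion in buckets.items()
    }


# Made-up ratings for two placeholder systems (not data from the paper).
example_ratings = [
    ("model_a", "accuracy", 4),
    ("model_a", "accuracy", 5),
    ("model_a", "coherence", 3),
    ("model_b", "accuracy", 2),
]
summary = mean_likert_scores(example_ratings)
```

Likert ratings are ordinal, so a median or a full rating distribution may be a more defensible summary than a mean; the mean is used here only to keep the sketch short.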