CL, AI, LG | Apr 25, 2024

Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation

arXiv:2405.00715v6 | 13 citations | h-index: 11 | Has Code | ACL
Originality: Incremental advance
AI Analysis

This work addresses the need in healthcare for locally hosted, privacy-preserving AI models that can generate expert-level clinical notes. The advance is incremental because it builds on existing adaptation methods.

The study addresses the generation of high-quality clinical notes from patient-doctor dialogues by adapting the open-source LLaMA-2 13-billion-parameter model. In a blinded physician reader study, 92.8% of individual evaluations rated notes from the resulting model, LLaMA-Clinic, as "acceptable" or higher, and the model matched physician-authored notes in real-world readiness for the "Assessment and Plan" section.

Proprietary Large Language Models (LLMs) such as GPT-4 and Gemini have demonstrated promising capabilities in clinical text summarization tasks. However, due to patient data privacy concerns and computational costs, many healthcare providers prefer using small, locally-hosted models over external generic LLMs. This study presents a comprehensive domain- and task-specific adaptation process for the open-source LLaMA-2 13 billion parameter model, enabling it to generate high-quality clinical notes from outpatient patient-doctor dialogues. Our process incorporates continued pretraining, supervised fine-tuning, and reinforcement learning from both AI and human feedback. We introduced a new approach, DistillDirect, for performing on-policy reinforcement learning with Gemini 1.0 Pro as the teacher model. Our resulting model, LLaMA-Clinic, can generate clinical notes comparable in quality to those authored by physicians. In a blinded physician reader study, the majority (92.8%) of individual evaluations rated the notes generated by LLaMA-Clinic as "acceptable" or higher across three criteria: real-world readiness, completeness, and accuracy. In the more challenging "Assessment and Plan" section, LLaMA-Clinic matched physician-authored notes in real-world readiness score. We highlight key considerations for future clinical note-generation tasks, emphasizing the importance of pre-defining a "best practice" note format, rather than relying on LLMs to determine this for clinical practice.
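
The headline figure in the abstract is a share of individual evaluations rated "acceptable" or higher, pooled across the three criteria. A minimal sketch of how such a share could be computed follows. The five-level scale, the field names, and the toy ratings are all assumptions made for illustration; the abstract names only the "acceptable" threshold and the three criteria, and none of the values below come from the study.

```python
# Hypothetical ordinal scale, lowest to highest. The abstract only names
# the "acceptable" level; the other labels are assumed for illustration.
SCALE = ["very poor", "poor", "acceptable", "good", "very good"]

# The three criteria named in the abstract (field names are hypothetical).
CRITERIA = ("real_world_readiness", "completeness", "accuracy")


def acceptable_share(evaluations, threshold="acceptable"):
    """Return the fraction of individual ratings at or above `threshold`.

    Each evaluation is a dict mapping every criterion to a label in SCALE.
    Every (evaluation, criterion) pair counts as one individual rating.
    """
    cutoff = SCALE.index(threshold)
    ratings = [ev[criterion] for ev in evaluations for criterion in CRITERIA]
    if not ratings:
        raise ValueError("no ratings supplied")
    hits = sum(1 for rating in ratings if SCALE.index(rating) >= cutoff)
    return hits / len(ratings)


# Toy, made-up ratings: 6 individual ratings, 5 of them at or above "acceptable".
toy_evaluations = [
    {"real_world_readiness": "good", "completeness": "acceptable", "accuracy": "very good"},
    {"real_world_readiness": "poor", "completeness": "good", "accuracy": "acceptable"},
]

print(f"{acceptable_share(toy_evaluations):.1%}")  # prints 83.3%
```

Pooling every (evaluation, criterion) pair into one denominator matches the abstract's wording about "individual evaluations"; a per-criterion breakdown would need one call per criterion instead.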

Code Implementations: 1 repo
