CL AI IR LG | Jan 23, 2025

MedSlice: Fine-Tuned Large Language Models for Secure Clinical Note Sectioning

Joshua Davis, Thomas Sounack, Kate Sciacca, Jessie M Brain, Brigitte N Durieux, Nicole D Agaronnik, Charlotta Lindvall

arXiv:2501.14105v16.73 citations | h-index: 35 | Has Code | JAMIA Open

Originality: Incremental advance

AI Analysis

This work addresses the need for privacy-preserving and cost-effective clinical data processing, though it is incremental because it applies existing fine-tuning methods to a specific domain.

The study tackled the problem of automated clinical note sectioning by fine-tuning open-source large language models, achieving an F1 score of 0.92 that outperformed proprietary models like GPT-4o.

Extracting sections from clinical notes is crucial for downstream analysis but is challenging because of variability in formatting and the labor-intensive nature of manual sectioning. While proprietary large language models (LLMs) have shown promise, privacy concerns limit their accessibility. This study develops a pipeline for automated note sectioning using open-source LLMs, focusing on three sections: History of Present Illness, Interval History, and Assessment and Plan. We fine-tuned three open-source LLMs to extract sections using a curated dataset of 487 progress notes and compared the results against proprietary models (GPT-4o, GPT-4o mini). Internal and external validity were assessed via precision, recall, and F1 score. Fine-tuned Llama 3.1 8B outperformed GPT-4o (F1=0.92). On the external validity test set, performance remained high (F1=0.85). Fine-tuned open-source LLMs can surpass proprietary models in clinical note sectioning, offering advantages in cost, performance, and accessibility.
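To make the evaluation metrics concrete, the sketch below scores one extracted section against a reference annotation. The abstract does not state the unit at which precision and recall were computed, so character-level overlap between half-open (start, end) offsets is an assumption here, and the function name span_prf is hypothetical rather than taken from the paper's code.

```python
from typing import Optional, Tuple

Span = Tuple[int, int]  # half-open character offsets: (start, end)


def span_prf(pred: Optional[Span], gold: Optional[Span]) -> Tuple[float, float, float]:
    """Character-level precision, recall, and F1 for one section span.

    Assumption: overlap is measured per character; the paper may use a
    different unit (tokens, lines, or exact match).
    """
    pred_chars = set(range(*pred)) if pred else set()
    gold_chars = set(range(*gold)) if gold else set()
    overlap = len(pred_chars & gold_chars)

    # Guard against empty spans so a missing prediction scores zero
    # instead of raising ZeroDivisionError.
    precision = overlap / len(pred_chars) if pred_chars else 0.0
    recall = overlap / len(gold_chars) if gold_chars else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical offsets for an "Assessment and Plan" section.
    p, r, f1 = span_prf(pred=(120, 320), gold=(100, 300))
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
    # prints: precision=0.90 recall=0.90 f1=0.90
```

Averaging these per-note scores over a test set would give one aggregate figure per section type; whether the reported F1 values were aggregated this way is not stated in the abstract.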

