CL Jun 28, 2024

EHRmonize: A Framework for Medical Concept Abstraction from Electronic Health Records using Large Language Models

João Matos, Jack Gallifant, Jian Pei, A. Ian Wong

arXiv:2407.00242v13.48 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the costly and expertise-intensive problem of EHR data harmonization for clinicians and healthcare researchers, though it is incremental as it applies existing LLMs to a new domain.

The researchers tackled the challenge of abstracting medical concepts from electronic health records by developing EHRmonize, a framework built on large language models. The best-performing model, GPT-4o, reached 100% accuracy on binary classification of antibiotics, and the framework reduced annotation time by an estimated 60%.

Electronic health records (EHRs) contain vast amounts of complex data, but harmonizing and processing this information remains a challenging and costly task requiring significant clinical expertise. While large language models (LLMs) have shown promise in various healthcare applications, their potential for abstracting medical concepts from EHRs remains largely unexplored. We introduce EHRmonize, a framework leveraging LLMs to abstract medical concepts from EHR data. Our study uses medication data from two real-world EHR databases to evaluate five LLMs on two free-text extraction and six binary classification tasks across various prompting strategies. GPT-4o with 10-shot prompting achieved the highest performance in all tasks, matched by Claude-3.5-Sonnet in a subset of tasks. GPT-4o achieved an accuracy of 97% in identifying generic route names, 82% for generic drug names, and 100% in performing binary classification of antibiotics. While EHRmonize significantly enhances efficiency, reducing annotation time by an estimated 60%, we emphasize that clinician oversight remains essential. Our framework, available as a Python package, offers a promising tool to assist clinicians in EHR data abstraction, potentially accelerating healthcare research and improving data harmonization processes.
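As a rough illustration of the evaluation setup the abstract describes (few-shot prompting for a binary classification task, scored by accuracy), the following is a minimal sketch. It does not use or reproduce the EHRmonize package API. The function names, the prompt layout, the example medication strings, and the keyword-based stand-in for an LLM are all assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

# (raw medication string, gold label "yes" or "no"); illustrative type alias
Example = Tuple[str, str]


def build_few_shot_prompt(question: str, shots: List[Example], query: str) -> str:
    """Assemble a plain-text few-shot prompt (hypothetical layout).

    Each labelled shot is shown as an Input/Answer pair, followed by the
    unlabelled query the model is asked to classify.
    """
    lines = [question, ""]
    for text, label in shots:
        lines.append(f"Input: {text}")
        lines.append(f"Answer: {label}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(predict: Callable[[str], str], labelled: List[Example]) -> float:
    """Fraction of labelled examples for which predict(text) equals the label."""
    if not labelled:
        raise ValueError("labelled must contain at least one example")
    correct = sum(
        predict(text).strip().lower() == label for text, label in labelled
    )
    return correct / len(labelled)


def keyword_stub_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): answers from a tiny keyword list.

    A real setup would send the prompt to a model such as GPT-4o instead.
    """
    # Only the final "Input:" line is the query; earlier ones are the shots.
    query_line = [ln for ln in prompt.splitlines() if ln.startswith("Input: ")][-1]
    known_antibiotics = ("vancomycin", "cefazolin", "amoxicillin")
    return "yes" if any(k in query_line.lower() for k in known_antibiotics) else "no"


if __name__ == "__main__":
    question = "Is the following medication an antibiotic? Answer yes or no."
    # Illustrative shots, not taken from the paper's databases.
    shots: List[Example] = [
        ("vancomycin 1 g IV", "yes"),
        ("acetaminophen 500 mg tab", "no"),
    ]
    test_set: List[Example] = [
        ("cefazolin 2 g IV", "yes"),
        ("insulin regular 10 units SC", "no"),
    ]

    def predict(text: str) -> str:
        return keyword_stub_model(build_few_shot_prompt(question, shots, text))

    print(f"accuracy: {accuracy(predict, test_set):.2f}")
```

Swapping `keyword_stub_model` for a real model call and growing `shots` to ten examples would approximate the 10-shot condition; as the abstract stresses, any such output would still need clinician review.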
