CL AI Dec 4, 2023

Fine-tuning pre-trained extractive QA models for clinical document parsing

Ashwyn Sharma, David I. Feldman, Aneesh Jain

arXiv:2312.02314v10.51 citations h-index: 4

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for efficient parsing of clinical documents to support remote patient monitoring. Its contribution is incremental: it fine-tunes existing extractive QA models for a specific domain rather than proposing a new method.

The paper tackles the extraction of clinical markers such as ejection fraction from unstructured echocardiogram reports, with the goal of identifying heart failure patients who are eligible for remote monitoring programs. The authors report that automating the task saved their clinicians over 1500 hours over 12 months.

Electronic health records (EHRs) contain a vast amount of high-dimensional multi-modal data that can accurately represent a patient's medical history. Unfortunately, most of this data is either unstructured or semi-structured, rendering it unsuitable for real-time and retrospective analyses. A remote patient monitoring (RPM) program for Heart Failure (HF) patients needs to have access to clinical markers like EF (Ejection Fraction) or LVEF (Left Ventricular Ejection Fraction) in order to ascertain eligibility and appropriateness for the program. This paper explains a system that can parse echocardiogram reports and verify EF values. This system helps identify eligible HF patients who can be enrolled in such a program. At the heart of this system is a pre-trained extractive QA transformer model that is fine-tuned on custom-labeled data. The methods used to prepare such a model for deployment are illustrated by running experiments on a public clinical dataset like MIMIC-IV-Note. The pipeline can be used to generalize solutions to similar problems in a low-resource setting. We found that the system saved over 1500 hours for our clinicians over 12 months by automating the task at scale.
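As a rough illustration of what "custom-labeled data" for an extractive QA model can look like, the sketch below builds one SQuAD-style training record and applies a simple plausibility check to an extracted EF span. It is not the authors' pipeline: the record layout follows the common SQuAD convention, the report text is invented for the example, and the helper names `make_squad_example` and `parse_ef` as well as the 0-100% range check are assumptions made here.

```python
import re


def make_squad_example(example_id, context, question, answer_text):
    """Build one SQuAD-style record (hypothetical helper, not from the paper).

    Extractive QA labels are character offsets into the context, so the
    answer text must appear verbatim in the report.
    """
    start = context.find(answer_text)
    if start < 0:
        raise ValueError("answer span not found in context")
    return {
        "id": example_id,
        "context": context,
        "question": question,
        "answers": {"text": [answer_text], "answer_start": [start]},
    }


def parse_ef(span):
    """Turn an extracted span into a (low, high) percentage pair.

    Returns None when the span holds no number, more than two numbers,
    or a value outside 0-100 (an assumed sanity check).
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", span)]
    if not numbers or len(numbers) > 2:
        return None
    low, high = min(numbers), max(numbers)
    if not (0 < low <= high <= 100):
        return None
    return (low, high)


# Invented report text, used only to exercise the helpers.
report = "Left ventricular systolic function is mildly reduced. LVEF is 40-45%."
example = make_squad_example(
    "demo-1", report, "What is the ejection fraction?", "40-45%"
)
ef_range = parse_ef(example["answers"]["text"][0])
```

Records in this shape are what extractive QA fine-tuning scripts commonly consume. A check like `parse_ef` would sit after the model, so that spans that do not look like an EF value can be routed to manual review instead of being accepted automatically.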
