CL | Sep 8, 2023

Retrieving Evidence from EHRs with LLMs: Possibilities and Challenges

Amazon, Salesforce
arXiv:2309.04550v3 | 28 citations | h-index: 52
Originality: Incremental advance
AI Analysis

This work addresses the challenge radiologists face in manually sifting through large volumes of EHR notes. It is an incremental advance, since it applies existing LLM capabilities to a specific domain.

The authors tackled the problem of efficiently retrieving and summarizing unstructured evidence from Electronic Health Records (EHRs) for radiologists' diagnoses, proposing a zero-shot LLM-based approach that was consistently preferred over a pre-LLM baseline in expert evaluations.

Unstructured data in Electronic Health Records (EHRs) often contains critical information -- complementary to imaging -- that could inform radiologists' diagnoses. But the large volume of notes often associated with patients together with time constraints renders manually identifying relevant evidence practically infeasible. In this work we propose and evaluate a zero-shot strategy for using LLMs as a mechanism to efficiently retrieve and summarize unstructured evidence in patient EHR relevant to a given query. Our method entails tasking an LLM to infer whether a patient has, or is at risk of, a particular condition on the basis of associated notes; if so, we ask the model to summarize the supporting evidence. Under expert evaluation, we find that this LLM-based approach provides outputs consistently preferred to a pre-LLM information retrieval baseline. Manual evaluation is expensive, so we also propose and validate a method using an LLM to evaluate (other) LLM outputs for this task, allowing us to scale up evaluation. Our findings indicate the promise of LLMs as interfaces to EHR, but also highlight the outstanding challenge posed by "hallucinations". In this setting, however, we show that model confidence in outputs strongly correlates with faithful summaries, offering a practical means to limit confabulations.
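The two-step strategy described in the abstract (first ask whether the patient has, or is at risk of, a condition; if so, ask for a summary of the supporting evidence; then use model confidence to filter likely confabulations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `LLM` callable signature, the `Evidence` type, the `min_confidence` threshold, and the keyword-matching `toy_llm` stand-in are all assumptions introduced here, and how a confidence score is obtained from a real model is left open.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: takes a prompt, returns (generated text, confidence in [0, 1]).
LLM = Callable[[str], Tuple[str, float]]


@dataclass
class Evidence:
    note_id: str
    summary: str
    confidence: float


def retrieve_evidence(
    notes: Dict[str, str],
    condition: str,
    llm: LLM,
    min_confidence: float = 0.5,  # assumed threshold, not a value from the paper
) -> List[Evidence]:
    """Zero-shot, two-step querying of each note, keeping confident summaries."""
    results = []
    for note_id, text in notes.items():
        # Step 1: ask whether the note indicates the condition or a risk of it.
        answer, _ = llm(
            f"Note:\n{text}\n\n"
            f"Does the patient have, or are they at risk of, {condition}? "
            "Answer Yes or No."
        )
        if not answer.strip().lower().startswith("yes"):
            continue
        # Step 2: only for positive answers, ask for the supporting evidence.
        summary, confidence = llm(
            f"Note:\n{text}\n\n"
            f"Summarize the evidence that the patient has, or is at risk of, {condition}."
        )
        # Discard low-confidence summaries as likely confabulations.
        if confidence >= min_confidence:
            results.append(Evidence(note_id, summary.strip(), confidence))
    return sorted(results, key=lambda e: e.confidence, reverse=True)


def toy_llm(prompt: str) -> Tuple[str, float]:
    """Keyword-matching stand-in for a real model, for demonstration only."""
    note = prompt.split("\n\n", 1)[0].lower()
    mentions = "warfarin" in note
    if "Answer Yes or No" in prompt:
        return ("Yes" if mentions else "No", 1.0)
    return ("Patient is on warfarin.", 0.9) if mentions else ("No evidence found.", 0.2)


notes = {
    "n1": "Patient on warfarin for atrial fibrillation.",
    "n2": "Routine follow-up, no complaints.",
}
hits = retrieve_evidence(notes, "intracranial hemorrhage", toy_llm)
for hit in hits:
    print(hit.note_id, hit.summary, hit.confidence)
```

In practice `toy_llm` would be replaced by a call to an actual model, and the threshold would need to be tuned against expert judgments of summary faithfulness.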

Code Implementations: 1 repo