CL | Feb 2, 2025

Universal Abstraction: Harnessing Frontier Models to Structure Real-World Data at Scale

Microsoft
arXiv:2502.00943v2 | 5 citations | h-index: 32
Originality: Incremental advance
AI Analysis

This work addresses the scalability challenge in medical abstraction for applications such as registry curation and clinical trials. The contribution is incremental, since it builds on existing frontier models.

The paper tackles the extraction of structured information from unstructured clinical text by introducing UniMedAbstractor (UMA), a framework that uses frontier large language models for zero-shot medical abstraction, eliminating the need for attribute-specific training labels or handcrafted rules. In an oncology evaluation, UMA running on a single frontier model such as GPT-4o matched or exceeded state-of-the-art attribute-specific methods.

A significant fraction of real-world patient information resides in unstructured clinical text. Medical abstraction extracts and normalizes key structured attributes from free-text clinical notes, which is the prerequisite for a variety of important downstream applications, including registry curation, clinical trial operations, and real-world evidence generation. Prior medical abstraction methods typically resort to building attribute-specific models, each of which requires extensive manual effort such as rule creation or supervised label annotation for the individual attribute, thus limiting scalability. In this paper, we show that existing frontier models already possess the universal abstraction capability for scaling medical abstraction to a wide range of clinical attributes. We present UniMedAbstractor (UMA), a unifying framework for zero-shot medical abstraction with a modular, customizable prompt template and the choice of any frontier large language model. Given a new attribute for abstraction, users only need to conduct lightweight prompt adaptation in UMA to adjust the specification in natural language. Compared to traditional methods, UMA eliminates the need for attribute-specific training labels or handcrafted rules, thus substantially reducing development time and cost. We conducted a comprehensive evaluation of UMA in oncology using a wide range of marquee attributes representing the cancer patient journey. These include relatively simple attributes typically specified within a single clinical note (e.g., performance status), as well as complex attributes requiring sophisticated reasoning across multiple notes at various time points (e.g., tumor staging). Based on a single frontier model such as GPT-4o, UMA matched or even exceeded the performance of state-of-the-art attribute-specific methods, each of which was tailored to the individual attribute.
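To make the idea of a modular, customizable prompt template concrete, the following is a minimal sketch under stated assumptions, not UMA's actual template or code. The names `AttributeSpec`, `build_prompt`, and `abstract_attribute`, the section layout of the prompt, the fallback value `"unknown"`, and the ECOG value list are all hypothetical and chosen only for illustration. The model is passed in as a plain text-in/text-out callable, so any frontier model client could be wrapped to fit; the stub used here stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Sequence


@dataclass
class AttributeSpec:
    """Hypothetical natural-language specification of one clinical attribute."""
    name: str
    definition: str
    allowed_values: Sequence[str]
    guidelines: Sequence[str] = field(default_factory=tuple)


def build_prompt(spec: AttributeSpec, notes: Sequence[str]) -> str:
    """Assemble a zero-shot prompt from independent, swappable sections."""
    sections = [
        f"Task: abstract the attribute '{spec.name}' from the clinical notes below.",
        f"Definition: {spec.definition}",
        "Allowed values: " + ", ".join(spec.allowed_values),
    ]
    if spec.guidelines:
        sections.append("Guidelines:\n" + "\n".join(f"- {g}" for g in spec.guidelines))
    numbered = "\n\n".join(f"[Note {i}] {n}" for i, n in enumerate(notes, 1))
    sections.append("Notes:\n" + numbered)
    sections.append("Answer with exactly one allowed value.")
    return "\n\n".join(sections)


def abstract_attribute(
    spec: AttributeSpec, notes: Sequence[str], llm: Callable[[str], str]
) -> str:
    """Query the model and normalize its answer to one of the allowed values."""
    raw = llm(build_prompt(spec, notes)).strip().lower()
    for value in spec.allowed_values:
        if raw == value.lower():
            return value
    # Assumption: answers outside the allowed set are mapped to "unknown".
    return "unknown"


# Illustrative attribute: performance status, here assumed to be on the ECOG scale.
ecog = AttributeSpec(
    name="ECOG performance status",
    definition="The ECOG performance status score documented by the clinician.",
    allowed_values=("0", "1", "2", "3", "4", "5", "unknown"),
    guidelines=("Prefer the most recent note when notes disagree.",),
)


def stub_llm(prompt: str) -> str:
    # Stand-in for a real frontier-model call; always answers "1".
    return " 1 "


result = abstract_attribute(ecog, ["Patient seen in clinic. ECOG 1."], stub_llm)
```

In this sketch, supporting a new attribute means writing a new `AttributeSpec` in natural language; the prompt assembly and the model call stay unchanged, which mirrors the lightweight prompt adaptation the abstract describes.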

