CL · AI · Feb 20, 2024

AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning

arXiv:2402.13225v1 · 42 citations · h-index: 42
Originality: Incremental advance
AI Analysis

AgentMD addresses usability and efficiency issues for healthcare professionals by automating the curation and application of clinical calculators. The advance is incremental, since it builds on existing language-agent and tool-learning paradigms.

The authors tackle the limited usability and scalability of clinical calculators in healthcare by introducing AgentMD, a language agent that automatically curates and applies clinical calculators. On the new RiskQA benchmark, AgentMD reaches 87.7% accuracy, compared with 40.9% for chain-of-thought prompting with GPT-4.

Clinical calculators play a vital role in healthcare by offering accurate evidence-based predictions for various purposes such as prognosis. Nevertheless, their widespread utilization is frequently hindered by usability challenges, poor dissemination, and restricted functionality. Augmenting large language models with extensive collections of clinical calculators presents an opportunity to overcome these obstacles and improve workflow efficiency, but the scalability of the manual curation process poses a significant challenge. In response, we introduce AgentMD, a novel language agent capable of curating and applying clinical calculators across various clinical contexts. Using the published literature, AgentMD has automatically curated a collection of 2,164 diverse clinical calculators with executable functions and structured documentation, collectively named RiskCalcs. Manual evaluations show that RiskCalcs tools achieve an accuracy of over 80% on three quality metrics. At inference time, AgentMD can automatically select and apply the relevant RiskCalcs tools given any patient description. On the newly established RiskQA benchmark, AgentMD significantly outperforms chain-of-thought prompting with GPT-4 (87.7% vs. 40.9% in accuracy). We also applied AgentMD to real-world clinical notes to analyze both population-level and risk-level patient characteristics. In summary, our study illustrates the utility of language agents augmented with clinical calculators for healthcare analytics and patient care.
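The select-then-apply loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Calculator` class, the two toy scores and their weights, and the keyword-overlap selection are hypothetical stand-ins, not the RiskCalcs tools or the selection method AgentMD actually uses.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Calculator:
    """One curated tool: structured documentation plus an executable function."""
    name: str
    description: str  # free-text documentation, used here for tool selection
    compute: Callable[[Dict[str, float]], float]


# Hypothetical toy scores with made-up weights; NOT real clinical calculators.
def toy_stroke_score(p: Dict[str, float]) -> float:
    return 1.0 * (p["age"] >= 65) + 1.0 * p["diabetes"] + 2.0 * p["prior_stroke"]


def toy_bleeding_score(p: Dict[str, float]) -> float:
    return 2.0 * (p["age"] >= 75) + 1.0 * p["hypertension"]


TOOLS: List[Calculator] = [
    Calculator("toy_stroke", "stroke risk in patients with atrial fibrillation",
               toy_stroke_score),
    Calculator("toy_bleeding", "major bleeding risk in patients on anticoagulation",
               toy_bleeding_score),
]


def tokenize(text: str) -> Set[str]:
    """Lowercase alphabetic tokens of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_tool(note: str, tools: List[Calculator]) -> Calculator:
    """Pick the tool whose description shares the most tokens with the note.

    Keyword overlap is a deliberately simple placeholder for the retrieval
    and LLM-based selection a real agent would perform.
    """
    note_tokens = tokenize(note)
    return max(tools, key=lambda t: len(note_tokens & tokenize(t.description)))


def apply_tool(tool: Calculator, patient: Dict[str, float]) -> float:
    """Run the selected calculator on structured patient features."""
    return tool.compute(patient)


note = "72-year-old with diabetes and prior stroke, assess stroke risk"
patient = {"age": 72, "diabetes": 1, "prior_stroke": 1, "hypertension": 0}
chosen = select_tool(note, TOOLS)
score = apply_tool(chosen, patient)
print(chosen.name, score)
```

In this sketch the patient features are supplied by hand. In the setting the abstract describes, the agent would also have to extract them from the free-text patient description, a step omitted here.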

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
