AI | LG | Feb 3

EHRWorld: A Patient-Centric Medical World Model for Long-Horizon Clinical Trajectories

arXiv:2602.03569v1 | 2 citations | h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the problem of reliable long-horizon clinical simulation for medical professionals. It is an incremental advance that contributes a new training paradigm and an accompanying dataset.

The paper asks whether large language models can serve as dynamic medical world models that simulate disease progression and treatment outcomes, and shows that naive LLM-based approaches suffer from error accumulation in long-horizon simulation. Its answer is EHRWorld, a patient-centric model trained on a large-scale longitudinal clinical dataset, which significantly outperforms those baselines in long-horizon stability, modeling of clinically sensitive events, and reasoning efficiency.

World models offer a principled framework for simulating future states under interventions, but realizing such models in complex, high-stakes domains like medicine remains challenging. Recent large language models (LLMs) have achieved strong performance on static medical reasoning tasks, raising the question of whether they can function as dynamic medical world models capable of simulating disease progression and treatment outcomes over time. In this work, we show that LLMs that incorporate only medical knowledge struggle to maintain consistent patient states under sequential interventions, leading to error accumulation in long-horizon clinical simulation. To address this limitation, we introduce EHRWorld, a patient-centric medical world model trained under a causal sequential paradigm, together with EHRWorld-110K, a large-scale longitudinal clinical dataset derived from real-world electronic health records. Extensive evaluations demonstrate that EHRWorld significantly outperforms naive LLM-based baselines, achieving more stable long-horizon simulation, improved modeling of clinically sensitive events, and favorable reasoning efficiency. These results highlight the necessity of training on causally grounded, temporally evolving clinical data for reliable and robust medical world modeling.
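The error accumulation mentioned in the abstract can be illustrated with a toy sketch. This is not EHRWorld or the paper's evaluation: the scalar "patient state", the linear dynamics, the coefficients, and the function names (`true_step`, `model_step`, `compare`) are all assumptions chosen only to show how a small one-step error compounds when a simulator feeds its own predictions back in.

```python
# Toy illustration only: a scalar "patient state" (think of a single lab value)
# evolving under a constant intervention. All numbers are hypothetical.

def true_step(state, dose):
    """Assumed ground-truth dynamics for the toy example."""
    return 0.9 * state + 1.0 * dose


def model_step(state, dose):
    """Assumed learned simulator with a slightly wrong decay coefficient."""
    return 0.92 * state + 1.0 * dose


def compare(initial_state, doses):
    """Return per-step errors for one-step prediction and for free rollout.

    One-step prediction restarts from the true state at every step.
    Free rollout feeds the model's own previous prediction back in,
    which is what a long-horizon simulation has to do.
    """
    true_state = initial_state
    rollout_state = initial_state
    one_step_errors, rollout_errors = [], []
    for dose in doses:
        next_true = true_step(true_state, dose)
        one_step_errors.append(abs(model_step(true_state, dose) - next_true))
        rollout_state = model_step(rollout_state, dose)
        rollout_errors.append(abs(rollout_state - next_true))
        true_state = next_true
    return one_step_errors, rollout_errors


one_step_errors, rollout_errors = compare(10.0, [1.0] * 20)
print("one-step error at final step:", round(one_step_errors[-1], 3))
print("rollout error at final step: ", round(rollout_errors[-1], 3))
```

In this toy setting the one-step error stays constant while the rollout error grows at every step, which is the qualitative pattern the abstract attributes to naive LLM-based simulators over long horizons.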

