EyeWorld: A Generative World Model of Ocular State and Dynamics
EyeWorld targets robust multimodal interpretation and prognosis-oriented simulation in ophthalmology by replacing static representation learning with explicit dynamical modeling of the eye.
Static medical foundation models degrade under modality and acquisition shifts in ophthalmology. To address this, the authors introduce EyeWorld, a generative world model that learns an observation-stable latent ocular state shared across modalities. This state enables fine-grained parsing, cross-modality translation, and quality-robust enhancement, and it supports forecasting of clinically meaningful progression.
Ophthalmic decision-making depends on subtle lesion-scale cues interpreted across multimodal imaging and over time, yet most medical foundation models remain static and degrade under modality and acquisition shifts. Here we introduce EyeWorld, a generative world model that conceptualizes the eye as a partially observed dynamical system grounded in clinical imaging. EyeWorld learns an observation-stable latent ocular state shared across modalities, unifying fine-grained parsing, structure-preserving cross-modality translation and quality-robust enhancement within a single framework. Longitudinal supervision further enables time-conditioned state transitions, supporting forecasting of clinically meaningful progression while preserving stable anatomy. By moving from static representation learning to explicit dynamical modeling, EyeWorld provides a unified approach to robust multimodal interpretation and prognosis-oriented simulation in medicine.
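One way to read "partially observed dynamical system" is as a state-space model. The sketch below is illustrative notation only: the symbols z_t (latent ocular state), x_t^{(m)} (image of modality m at time t), q (acquisition quality), \Delta (time gap), and the maps f_\phi, g_\theta, p_\theta are assumptions introduced here, not definitions taken from the abstract.

```latex
% Illustrative notation only; all symbols are assumptions, not from the paper.
\begin{align}
  % Inference: any modality m maps to the same shared latent ocular state
  \hat{z}_t &= f_\phi\big(x_t^{(m)}\big),
  \qquad f_\phi\big(x_t^{(m)}\big) \approx f_\phi\big(x_t^{(m')}\big)
  \quad \text{(observation-stable)} \\
  % Observation: an image depends on the state, the modality, and the acquisition quality
  x_t^{(m)} &\sim p_\theta\big(x_t^{(m)} \mid z_t,\, m,\, q\big) \\
  % Cross-modality translation: encode from modality m, decode into modality m'
  \hat{x}_t^{(m')} &= g_\theta\big(f_\phi(x_t^{(m)}),\, m'\big) \\
  % Time-conditioned transition: forecast the state after a gap \Delta
  z_{t+\Delta} &\sim p_\theta\big(z_{t+\Delta} \mid z_t,\, \Delta\big)
\end{align}
```

Under this reading, parsing, translation, and enhancement all operate on the same estimate \hat{z}_t, and forecasting amounts to applying the transition and then decoding the predicted state into an image.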