CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?
This work addresses inconsistent, non-reproducible benchmarking of echocardiography AI models, an incremental but important problem for researchers and clinicians in medical imaging.
The authors address the lack of standardized evaluation for echocardiography foundation models by introducing CardioBench, a benchmark that unifies eight public datasets across nine tasks. The results show that temporal modeling is critical for functional regression and that retrieval adds robustness under distribution shift, while general-purpose encoders transfer strongly but struggle with fine-grained tasks.
Foundation models (FMs) are reshaping medical imaging, yet their application in echocardiography remains limited. While several echocardiography-specific FMs have recently been introduced, no standardized benchmark exists to evaluate them. Echocardiography poses unique challenges, including noisy acquisitions, high frame redundancy, and limited public datasets. Most existing solutions evaluate on private data, restricting comparability. To address this, we introduce CardioBench, a comprehensive benchmark for echocardiography FMs. CardioBench unifies eight publicly available datasets into a standardized suite spanning four regression and five classification tasks, covering functional, structural, diagnostic, and view recognition endpoints. We evaluate several leading FMs, including cardiac-specific, biomedical, and general-purpose encoders, under consistent zero-shot, probing, and alignment protocols. Our results highlight complementary strengths across model families: temporal modeling is critical for functional regression, retrieval provides robustness under distribution shift, and domain-specific text encoders capture physiologically meaningful axes. General-purpose encoders transfer strongly and often close the gap with probing, but struggle with fine-grained distinctions like view classification and subtle pathology recognition. By releasing preprocessing, splits, and public evaluation pipelines, CardioBench establishes a reproducible reference point and offers actionable insights to guide the design of future echocardiography foundation models.
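To make the retrieval idea concrete, the following is a minimal sketch of retrieval-based regression on frozen embeddings. It is not the CardioBench pipeline: the function names (`mean_pool`, `retrieve_predict`), mean pooling as the frame aggregation, cosine similarity, and the choice of `k` are all assumptions made for illustration, and the embeddings and target values are toy numbers rather than real data.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame embeddings into one video-level embedding.

    Assumption: simple mean pooling; a real model may aggregate frames differently.
    """
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def retrieve_predict(
    query: Sequence[float],
    bank: Sequence[Sequence[float]],
    targets: Sequence[float],
    k: int = 3,
) -> float:
    """Predict a scalar by averaging the targets of the k most similar bank entries."""
    scored = sorted(
        ((cosine(query, emb), tgt) for emb, tgt in zip(bank, targets)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = scored[:k]
    return sum(tgt for _, tgt in top) / len(top)


# Toy example: three labelled embeddings (hypothetical values, not real measurements).
bank = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
targets = [60.0, 58.0, 30.0]

# A query "video" with two frame embeddings, pooled into one vector.
query = mean_pool([[1.0, 0.0], [1.0, 0.2]])
prediction = retrieve_predict(query, bank, targets, k=2)
print(prediction)
```

Because the prediction is drawn from labelled neighbours rather than from a trained head, a scheme of this kind needs no gradient updates on the target dataset, which is one plausible reason retrieval could hold up under distribution shift.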