Foundation Models in Medical Imaging: A Review and Outlook
This review addresses how to reduce reliance on manual annotations in medical imaging, and is aimed at researchers and practitioners. Its contribution is incremental: it synthesizes existing studies rather than presenting new findings.
The paper reviews how foundation models are transforming medical image analysis: they learn from large unlabeled datasets and are then adapted to specific clinical tasks with minimal supervision. It draws on evidence from over 150 studies across pathology, radiology, and ophthalmology.
Foundation models (FMs) are changing the way medical images are analyzed by learning from large collections of unlabeled data. Instead of relying on manually annotated examples, FMs are pre-trained to learn general-purpose visual features that can later be adapted to specific clinical tasks with little additional supervision. In this review, we examine how FMs are being developed and applied in pathology, radiology, and ophthalmology, drawing on evidence from over 150 studies. We explain the core components of FM pipelines, including model architectures, self-supervised learning methods, and strategies for downstream adaptation. We also review how FMs are being used in each imaging domain and compare design choices across applications. Finally, we discuss key challenges and open questions to guide future research.