SynthFM: Training Modality-agnostic Foundation Models for Medical Image Segmentation without Real Medical Data
The work addresses the high cost of annotating medical images for segmentation and the domain expertise it requires, although the contribution is incremental because it builds on existing foundation models such as SAM.
The paper tackles the problem of adapting foundation models to medical image segmentation without real medical data. It proposes SynthFM, a synthetic data generation framework, and reports better results than zero-shot baselines such as SAM and MedSAM on 11 anatomical structures across 9 datasets.
Foundation models like the Segment Anything Model (SAM) excel in zero-shot segmentation for natural images but struggle with medical image segmentation due to differences in texture, contrast, and noise. Annotating medical images is costly and requires domain expertise, limiting large-scale annotated data availability. To address this, we propose SynthFM, a synthetic data generation framework that mimics the complexities of medical images, enabling foundation models to adapt without real medical data. Using SAM's pretrained encoder and training the decoder from scratch on SynthFM's dataset, we evaluated our method on 11 anatomical structures across 9 datasets (CT, MRI, and Ultrasound). SynthFM outperformed zero-shot baselines like SAM and MedSAM, achieving superior results under different prompt settings and on out-of-distribution datasets.
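To make the idea of synthetic training pairs concrete, the following is a minimal sketch of a generator that produces an image and its segmentation mask with low contrast, a smooth texture, and additive noise, the three properties the abstract names as differing from natural images. It is an illustration only and not SynthFM's actual generation procedure, which is not described in this section. The function name `make_synthetic_pair`, the elliptical shape, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def make_synthetic_pair(size=64, contrast=0.15, noise_std=0.05, seed=0):
    """Return (image, mask) as nested lists.

    Hypothetical toy generator, not the SynthFM pipeline: one random
    ellipse on a uniform background, with texture and Gaussian noise.
    """
    rng = random.Random(seed)

    # Random ellipse: center near the middle, radii between 1/8 and 1/4 of the size.
    cx = rng.uniform(0.35, 0.65) * size
    cy = rng.uniform(0.35, 0.65) * size
    rx = rng.uniform(0.125, 0.25) * size
    ry = rng.uniform(0.125, 0.25) * size

    # Low contrast between the structure and its surroundings (assumed values).
    background = rng.uniform(0.3, 0.6)
    foreground = background + contrast

    image, mask = [], []
    for y in range(size):
        image_row, mask_row = [], []
        for x in range(size):
            inside = ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0
            base = foreground if inside else background
            # Smooth sinusoidal variation as a crude texture, plus Gaussian noise.
            texture = 0.03 * math.sin(0.5 * x) * math.cos(0.5 * y)
            value = base + texture + rng.gauss(0.0, noise_std)
            image_row.append(min(1.0, max(0.0, value)))  # clamp to [0, 1]
            mask_row.append(1 if inside else 0)
        image.append(image_row)
        mask.append(mask_row)
    return image, mask


if __name__ == "__main__":
    image, mask = make_synthetic_pair(size=32, seed=1)
    foreground_pixels = sum(sum(row) for row in mask)
    print(f"image size: {len(image)}x{len(image[0])}, foreground pixels: {foreground_pixels}")
```

Because the mask is produced together with the image, every generated sample is labeled for free, which is what lets a decoder be trained without annotated medical scans. A realistic generator would need far more varied shapes, textures, and intensity patterns than this sketch provides.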