Are foundation models efficient for medical image segmentation?
This work examines the efficiency of foundation models for medical image segmentation, showing that a general-purpose foundation model was less efficient than a specialized method on the task studied.
The study compared the Segment Anything Model (SAM) to a modality-specific self-supervised learning method on cardiac ultrasound segmentation, finding that SAM performed poorly while requiring more labeling and computing resources.
Foundation models are experiencing a surge in popularity. The Segment Anything Model (SAM) asserts an ability to segment a wide spectrum of objects but required supervised training at unprecedented scale. We compared SAM's performance (against clinical ground truth) and resource use (labeling time, compute) with those of a modality-specific, label-free self-supervised learning (SSL) method on 25 measurements for 100 cardiac ultrasounds. SAM performed poorly and required significantly more labeling and computing resources, making it less efficient than SSL.