CVJul 9, 2024

CycleSAM: Few-Shot Surgical Scene Segmentation with Cycle- and Scene-Consistent Feature Matching

arXiv:2407.06795v24 citationsh-index: 33
AI Analysis

This addresses the problem of domain adaptation for surgical scene segmentation, which is important for medical AI applications but represents an incremental improvement over existing few-shot segmentation methods.

The paper tackles the challenge of surgical image segmentation with limited annotated data by introducing CycleSAM, an improved visual prompt learning approach that outperforms existing few-shot SAM methods by 2-4x on four surgical datasets.

Surgical image segmentation is highly challenging, primarily due to scarcity of annotated data. Generalist prompted segmentation models like the Segment-Anything Model (SAM) can help tackle this task, but because they require image-specific visual prompts for effective performance, their use is limited to improving data annotation efficiency. Recent approaches extend SAM to automatic segmentation by using a few labeled reference images to predict point prompts; however, they rely on feature matching pipelines that lack robustness to out-of-domain data like surgical images. To tackle this problem, we introduce CycleSAM, an improved visual prompt learning approach that employs a data-efficient training phase and enforces a series of soft constraints to produce high-quality feature similarity maps. CycleSAM label-efficiently addresses domain gap by leveraging surgery-specific self-supervised feature extractors, then adapts the resulting features through a short parameter-efficient training stage, enabling it to produce informative similarity maps. CycleSAM further filters the similarity maps with a series of consistency constraints before robustly sampling diverse point prompts for each object instance. In our experiments on four diverse surgical datasets, we find that CycleSAM outperforms existing few-shot SAM approaches by a factor of 2-4x in both 1-shot and 5-shot settings, while also achieving strong performance gains over traditional linear probing, parameter-efficient adaptation, and pseudo-labeling methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes