MTRL-SCI CV | Apr 9, 2024

SAM-I-Am: Semantic Boosting for Zero-shot Atomic-Scale Electron Micrograph Segmentation

Waqwoya Abebe, Jan Strube, Luanzheng Guo, Nathan R. Tallent, Oceane Bel, Steven Spurgeon, Christina Doty, Ali Jannesari

arXiv:2404.06638v25.115 citations | h-index: 21 | Has Code | Comput mater sci

Originality: Incremental advance

AI Analysis

This work addresses the challenge of costly and infeasible fine-tuning in domains like electron microscopy, offering a rapid adaptation method for segmentation tasks.

The paper tackles the problem of adapting zero-shot foundation models to domain-specific image segmentation, specifically atomic-scale electron micrograph segmentation. It proposes semantic boosting, which improves mean IoU by up to 21.35 percentage points (absolute) and reduces mean false positive masks by up to 18.42% compared to vanilla SAM (ViT-L).

Image segmentation is a critical enabler for tasks ranging from medical diagnostics to autonomous driving. However, the correct segmentation semantics - where are boundaries located? what segments are logically similar? - change depending on the domain, such that state-of-the-art foundation models can generate meaningless and incorrect results. Moreover, in certain domains, fine-tuning and retraining techniques are infeasible: obtaining labels is costly and time-consuming; domain images (micrographs) can be exponentially diverse; and data sharing (for third-party retraining) is restricted. To enable rapid adaptation of the best segmentation technology, we propose the concept of semantic boosting: given a zero-shot foundation model, guide its segmentation and adjust results to match domain expectations. We apply semantic boosting to the Segment Anything Model (SAM) to obtain microstructure segmentation for transmission electron microscopy. Our booster, SAM-I-Am, extracts geometric and textural features of various intermediate masks to perform mask removal and mask merging operations. We demonstrate a zero-shot performance increase of (absolute) +21.35%, +12.6%, +5.27% in mean IoU, and a -9.91%, -18.42%, -4.06% drop in mean false positive masks across images of three difficulty classes over vanilla SAM (ViT-L).
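To make the mask removal and mask merging idea concrete, here is a minimal sketch of that kind of post-processing, together with the IoU metric used for evaluation. It is not the SAM-I-Am implementation: the function names (`iou`, `mask_features`, `boost`), the choice of features (area fraction as a geometric cue, intensity mean and standard deviation as a crude stand-in for texture), and the thresholds `min_area_frac` and `merge_tol` are all assumptions made for illustration. The input masks are assumed to be boolean arrays such as those a zero-shot model might produce.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boolean masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0


def mask_features(image, mask):
    """Hypothetical descriptors: area fraction, mean and std of intensity."""
    pixels = image[mask]
    return np.array([mask.mean(), pixels.mean(), pixels.std()])


def boost(image, masks, min_area_frac=0.01, merge_tol=0.05):
    """Illustrative removal + merging pass over candidate masks.

    Assumed rules (not taken from the paper):
      * removal: drop masks covering less than `min_area_frac` of the image
      * merging: greedily union masks whose intensity mean and std both
        differ by at most `merge_tol`
    """
    masks = [np.asarray(m, dtype=bool) for m in masks]

    # Mask removal: discard masks that are too small to be a real region.
    kept = [m for m in masks if m.mean() >= min_area_frac]

    # Mask merging: group masks with similar textural descriptors.
    merged = []  # each entry is [mask, (mean, std)]
    for m in kept:
        feats = mask_features(image, m)[1:]
        for entry in merged:
            if np.abs(entry[1] - feats).max() <= merge_tol:
                entry[0] = entry[0] | m
                entry[1] = mask_features(image, entry[0])[1:]
                break
        else:
            merged.append([m.copy(), feats])
    return [entry[0] for entry in merged]
```

A real booster would likely use richer geometric and textural features than these; the sketch only shows the overall shape of a filter-then-merge pass and how the result could be scored against a reference mask with `iou`.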
