CV · Oct 17, 2025

Memory-SAM: Human-Prompt-Free Tongue Segmentation via Retrieval-to-Prompt

Joongwon Chae, Lihui Luo, Xi Yuan, Dongmei Yu, Zhenglin Chen, Lian Zhang, Peiwu Qin

arXiv:2510.15849v1 · 3.6 · h-index: 6 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for data-efficient, robust segmentation of irregular tongue boundaries in medical imaging. It offers a human-prompt-free pipeline and is an incremental advance over existing SAM-family models.

The paper addresses tongue segmentation for Traditional Chinese Medicine (TCM) analysis by introducing Memory-SAM, a training-free method that automatically generates prompts from a small memory of prior cases. Evaluated on 600 expert-annotated images (300 controlled, 300 in-the-wild), it achieves an mIoU of 0.9863 on the mixed test split, outperforming FCN (0.8188) and a detector-to-box SAM baseline (0.1839).
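For readers unfamiliar with the metric, the following is a minimal sketch of how an mIoU figure of this kind could be computed for binary masks. It assumes mIoU means the per-image foreground IoU averaged over the test images; the paper may define it differently, and the function names `iou` and `mean_iou` are illustrative, not taken from the authors' code.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)

def mean_iou(preds, gts):
    """Average the per-image IoU over paired predictions and ground truths."""
    return float(np.mean([iou(p, g) for p, g in zip(preds, gts)]))
```

Under this definition, a score near 0.98 means the predicted and annotated regions differ in only about 2% of their combined area, which is why small differences at that level are hard to separate from annotation variability.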

Accurate tongue segmentation is crucial for reliable TCM analysis. Supervised models require large annotated datasets, while SAM-family models remain prompt-driven. We present Memory-SAM, a training-free, human-prompt-free pipeline that automatically generates effective prompts from a small memory of prior cases via dense DINOv3 features and FAISS retrieval. Given a query image, mask-constrained correspondences to the retrieved exemplar are distilled into foreground/background point prompts that guide SAM2 without manual clicks or model fine-tuning. We evaluate on 600 expert-annotated images (300 controlled, 300 in-the-wild). On the mixed test split, Memory-SAM achieves mIoU 0.9863, surpassing FCN (0.8188) and a detector-to-box SAM baseline (0.1839). On controlled data, ceiling effects above 0.98 make small differences less meaningful given annotation variability, while our method shows clear gains under real-world conditions. Results indicate that retrieval-to-prompt enables data-efficient, robust segmentation of irregular boundaries in tongue imaging. The code is publicly available at https://github.com/jw-chae/memory-sam.
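The retrieval-to-prompt idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: plain arrays stand in for DINOv3 features, brute-force cosine similarity stands in for FAISS retrieval, and the margin-based point selection is an assumed heuristic. The names `retrieve_exemplar` and `correspondences_to_prompts` are hypothetical. The output is a set of point coordinates with labels (1 for foreground, 0 for background), the general shape of input that point-prompted SAM-style predictors accept.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products act as cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def retrieve_exemplar(query_global, memory_globals):
    """Return the index of the memory entry most similar to the query.

    query_global: (D,) global descriptor of the query image.
    memory_globals: (N, D) descriptors of stored cases.
    Brute-force search; an index such as FAISS would replace this at scale.
    """
    sims = l2_normalize(memory_globals) @ l2_normalize(query_global)
    return int(np.argmax(sims))

def correspondences_to_prompts(query_feats, exemplar_feats, exemplar_mask,
                               n_fg=3, n_bg=3):
    """Turn mask-constrained feature matches into point prompts.

    query_feats, exemplar_feats: (H, W, D) dense patch features.
    exemplar_mask: (H, W) bool, True on the exemplar's foreground.
    Assumes the mask contains both foreground and background patches.
    Returns points (n_fg + n_bg, 2) as (x, y) patch coordinates and
    labels (n_fg + n_bg,) with 1 = foreground, 0 = background.
    """
    H, W, D = query_feats.shape
    q = l2_normalize(query_feats.reshape(-1, D))
    e = l2_normalize(exemplar_feats.reshape(-1, D))
    m = exemplar_mask.reshape(-1)
    sim = q @ e.T                           # (H*W, H*W) cosine similarities
    fg_score = sim[:, m].max(axis=1)        # best match inside the mask
    bg_score = sim[:, ~m].max(axis=1)       # best match outside the mask
    margin = fg_score - bg_score            # assumed heuristic, see lead-in
    order = np.argsort(margin)
    fg_idx = order[::-1][:n_fg]             # most foreground-like patches
    bg_idx = order[:n_bg]                   # most background-like patches
    idx = np.concatenate([fg_idx, bg_idx])
    ys, xs = np.divmod(idx, W)              # flat index -> (row, column)
    points = np.stack([xs, ys], axis=1)
    labels = np.concatenate([np.ones(n_fg, dtype=int),
                             np.zeros(n_bg, dtype=int)])
    return points, labels
```

In a full pipeline the patch coordinates would be rescaled to pixel coordinates before being passed to the segmenter. The sketch only shows why no manual clicks are needed: the exemplar's mask decides which query patches become positive or negative points.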
