SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM
This work addresses uncertainty in medical image segmentation, which matters to clinicians, but it is incremental: it builds on existing pre-trained models and Multiple Choice Learning techniques.
The paper tackles the problem of generating multiple segmentation masks for medical images in order to capture uncertainty. It introduces SeqSAM, which uses a sequential approach with a bipartite matching loss to produce an arbitrary number of clinically relevant masks, and it reports notable improvements in mask quality on two public datasets.
Pre-trained segmentation models are a powerful and flexible tool for segmenting images. Recently, this trend has extended to medical imaging. Yet these methods often produce only a single prediction for a given image, neglecting the inherent uncertainty in medical images that arises from unclear object boundaries and errors caused by the annotation tool. Multiple Choice Learning is a technique for generating multiple masks through multiple learned prediction heads. However, it cannot readily be extended to produce more outputs than the number fixed by its initial pre-training hyperparameters, because the sparse, winner-takes-all loss function makes it easy for one prediction head to become overly dominant, so the clinical relevance of each mask produced is not guaranteed. We introduce SeqSAM, a sequential, RNN-inspired approach to generating multiple masks, which uses a bipartite matching loss to ensure the clinical relevance of each mask and can produce an arbitrary number of masks. We show notable improvements in the quality of each mask produced across two publicly available datasets. Our code is available at https://github.com/BenjaminTowle/SeqSAM.
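The contrast between a winner-takes-all loss and a bipartite matching loss can be illustrated with a small sketch. This is not the SeqSAM implementation: the Dice-based cost, the brute-force matching over permutations, the equal number of predictions and annotations, and all function names (`dice_loss`, `winner_takes_all`, `bipartite_matching`) are assumptions chosen only to keep the example self-contained.

```python
from itertools import permutations


def dice_loss(pred, target, eps=1e-6):
    """1 - Dice overlap between two flat binary masks (lists of 0/1)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def cost_matrix(preds, annotations):
    """cost[i][j] is the loss of prediction i against annotation j."""
    return [[dice_loss(p, a) for a in annotations] for p in preds]


def winner_takes_all(preds, annotations):
    """For each annotation, only the closest prediction is penalised.

    Returns the mean loss and, per annotation, the index of the winning
    prediction. The same prediction may win more than once.
    """
    cost = cost_matrix(preds, annotations)
    winners = []
    total = 0.0
    for j in range(len(annotations)):
        best = min(range(len(preds)), key=lambda i: cost[i][j])
        winners.append(best)
        total += cost[best][j]
    return total / len(annotations), winners


def bipartite_matching(preds, annotations):
    """One-to-one assignment of predictions to annotations with minimal total cost.

    Brute force over permutations, so only suitable for a handful of masks.
    Assumes len(preds) == len(annotations) (a simplification for this sketch).
    Returns the mean loss and, per prediction, the index of its annotation.
    """
    cost = cost_matrix(preds, annotations)
    n = len(preds)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(cost[i][perm[i]] for i in range(n)),
    )
    total = sum(cost[i][best[i]] for i in range(n))
    return total / n, list(best)


# Toy data: two annotators disagree on the extent of the object.
annotations = [[1, 1, 0, 0], [1, 1, 1, 1]]
# Prediction 0 is a reasonable mask, prediction 1 is a poor one.
preds = [[1, 1, 1, 0], [0, 0, 0, 1]]

wta_loss, winners = winner_takes_all(preds, annotations)
match_loss, assignment = bipartite_matching(preds, annotations)
print("winner-takes-all winners:", winners)      # [0, 0]
print("bipartite assignment:   ", assignment)    # [0, 1]
```

In this toy case prediction 0 wins both annotations under winner-takes-all, so prediction 1 receives no training signal, which is the dominance problem the abstract describes. The one-to-one matching forces every prediction to be paired with some annotation. For larger numbers of masks, a Hungarian-algorithm solver would replace the brute-force search.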