Segment anything model for head and neck tumor segmentation with CT, PET and MRI multi-modality images
This work addresses the challenge of integrating multi-modality images for tumor segmentation in head and neck cancer, and it represents an incremental improvement over existing methods.
This study addresses automatic gross tumor volume (GTV) segmentation in head and neck cancer using the Segment Anything Model (SAM) with CT, PET, and MRI images. It finds that fine-tuning SAM significantly improves segmentation accuracy over the zero-shot results obtained with bounding box prompts.
Deep learning presents novel opportunities for the auto-segmentation of gross tumor volume (GTV) in head and neck cancer (HNC), yet fully automatic methods usually necessitate significant manual refinement. This study investigates the Segment Anything Model (SAM), which is recognized for requiring minimal human prompting and for its zero-shot generalization ability across natural images. We specifically examine MedSAM, a version of SAM fine-tuned with large-scale public medical images. Despite this progress, integrating multi-modality images (CT, PET, MRI) for effective GTV delineation remains a challenge. Focusing on SAM's application to HNC GTV segmentation, we assess its performance in both zero-shot and fine-tuned scenarios using single-modality (CT-only) and fused multi-modality images. Our study demonstrates that fine-tuning SAM significantly enhances its segmentation accuracy, building upon the already effective zero-shot results achieved with bounding box prompts. These findings open a promising avenue for semi-automatic HNC GTV segmentation.
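
To make the two inputs named in the abstract concrete, the following is a minimal sketch of how a fused multi-modality image and a bounding box prompt might be prepared for a SAM-style model. It is an illustration under stated assumptions, not the study's pipeline: the abstract does not say how the modalities are fused, so stacking co-registered CT, PET, and MRI slices as three image channels is assumed here, and the intensity windows, the helper names (`normalize`, `fuse_modalities`, `box_prompt_from_mask`), and the `margin` parameter are all hypothetical.

```python
import numpy as np


def normalize(slice_2d, lo, hi):
    """Clip a 2D slice to [lo, hi] and rescale it to [0, 1]."""
    x = slice_2d.astype(np.float32)
    if hi <= lo:
        # Constant (or empty) slice: nothing to rescale.
        return np.zeros_like(x)
    return (np.clip(x, lo, hi) - lo) / (hi - lo)


def fuse_modalities(ct, pet, mri):
    """Stack three co-registered 2D slices into one H x W x 3 uint8 image.

    Assumption: channel stacking is only one possible fusion strategy, and
    the intensity ranges below are illustrative, not taken from the study.
    """
    channels = [
        normalize(ct, -160.0, 240.0),                       # assumed soft-tissue window (HU)
        normalize(pet, 0.0, float(pet.max())),              # per-slice PET scaling
        normalize(mri, float(mri.min()), float(mri.max())), # per-slice MRI scaling
    ]
    return (np.stack(channels, axis=-1) * 255).round().astype(np.uint8)


def box_prompt_from_mask(mask, margin=0):
    """Return [x_min, y_min, x_max, y_max] enclosing the nonzero mask pixels.

    Returns None when the mask is empty, i.e. no prompt for this slice.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    h, w = mask.shape
    return [
        max(int(xs.min()) - margin, 0),
        max(int(ys.min()) - margin, 0),
        min(int(xs.max()) + margin, w - 1),
        min(int(ys.max()) + margin, h - 1),
    ]
```

The box is returned in x-min, y-min, x-max, y-max pixel order, which is the XYXY layout that SAM-style box prompts use. In a semi-automatic workflow the box would come from a clinician rather than from a reference mask; deriving it from a mask, as here, is a common way to simulate that prompt during evaluation.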