IV | CV | Mar 31, 2025

AI-Assisted Colonoscopy: Polyp Detection and Segmentation using Foundation Models

arXiv:2503.24138v1 | 2 citations | h-index: 9 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of missed polyps in colonoscopy, a concern for medical practitioners, but its contribution is incremental: it focuses on optimizing existing foundation models rather than introducing a new approach.

The study evaluated foundation models for polyp detection and segmentation in colonoscopy, finding that domain-specific models or fine-tuned generic models outperformed state-of-the-art benchmarks, with some achieving superior zero-shot performance on unseen data.

In colonoscopy, 80% of missed polyps could be detected with the help of deep learning models. In the search for algorithms capable of addressing this challenge, foundation models emerge as promising candidates. Their zero-shot or few-shot learning capabilities facilitate generalization to new data or tasks without extensive fine-tuning, a property that is particularly advantageous in medical imaging, where large annotated datasets for traditional training are scarce. In this context, a comprehensive evaluation of foundation models for polyp segmentation was conducted, assessing both detection and delimitation. Three colonoscopy datasets were used to compare the performance of five foundation models (DINOv2, YOLO-World, GroundingDINO, SAM and MedSAM) against two benchmark networks (YOLOv8 and Mask R-CNN). Results show that the success of foundation models in polyp characterization is highly dependent on domain specialization: for optimal performance in medical applications, domain-specific models are essential, and generic models require fine-tuning to achieve effective results. Through this specialization, foundation models demonstrated superior performance compared to state-of-the-art detection and segmentation models, with some even excelling in zero-shot evaluation, outperforming fine-tuned models on unseen data.

Code Implementations: 1 repo