CV | AI | Apr 3, 2025

Systematic Evaluation of Large Vision-Language Models for Surgical Artificial Intelligence

DeepMind, Stanford
arXiv:2504.02799v1 | 3 citations | h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses uncertainty about the utility of vision-language models (VLMs) in surgery, a domain with scarce expert-annotated data and variable clinical scenarios, and offers insights for clinical and broader real-world applications.

The paper systematically evaluates 11 large vision-language models across 17 surgical AI tasks using 13 datasets. It finds that VLMs show promising generalizability, at times outperforming supervised models deployed outside their training setting, and that in-context learning boosts performance up to three-fold.

Large Vision-Language Models offer a new paradigm for AI-driven image understanding, enabling models to perform tasks without task-specific training. This flexibility holds particular promise across medicine, where expert-annotated data is scarce. Yet, VLMs' practical utility in intervention-focused domains--especially surgery, where decision-making is subjective and clinical scenarios are variable--remains uncertain. Here, we present a comprehensive analysis of 11 state-of-the-art VLMs across 17 key visual understanding tasks in surgical AI--from anatomy recognition to skill assessment--using 13 datasets spanning laparoscopic, robotic, and open procedures. In our experiments, VLMs demonstrate promising generalizability, at times outperforming supervised models when deployed outside their training setting. In-context learning, incorporating examples during testing, boosted performance up to three-fold, suggesting adaptability as a key strength. Still, tasks requiring spatial or temporal reasoning remained difficult. Beyond surgery, our findings offer insights into VLMs' potential for tackling complex and dynamic scenarios in clinical and broader real-world applications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
