CV · RO · Mar 23

A vision-language model and platform for temporally mapping surgery from video

arXiv:2603.22583 · 4.6 · h-index: 13
Predicted impact: top 81% in CV · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making surgical AI accessible and useful for practicing surgeons worldwide. It appears incremental, however, because it builds on existing vision-language models and applies them to a specific domain.

The authors address the problem of mapping surgical procedures from video, a step toward developing operative guidelines and enabling autonomous robotic surgery. They introduce Halsted, a vision-language model trained on a comprehensive annotated video library, and report that it surpasses previous state-of-the-art models in mapping surgical activity while offering greater comprehensiveness and computational efficiency.

Mapping surgery is fundamental to developing operative guidelines and enabling autonomous robotic surgery. Recent advances in artificial intelligence (AI) have shown promise in mapping the behaviour of surgeons from videos, yet current models remain narrow in scope, capturing limited behavioural components within single procedures, and offer limited translational value, as they remain inaccessible to practising surgeons. Here we introduce Halsted, a vision-language model trained on the Halsted Surgical Atlas (HSA), one of the most comprehensive annotated video libraries grown through an iterative self-labelling framework and encompassing over 650,000 videos across eight surgical specialties. To facilitate benchmarking, we publicly release HSA-27k, a subset of the Halsted Surgical Atlas. Halsted surpasses previous state-of-the-art models in mapping surgical activity while offering greater comprehensiveness and computational efficiency. To bridge the longstanding translational gap of surgical AI, we develop the Halsted web platform (https://halstedhealth.ai/) to provide surgeons anywhere in the world with the previously-unavailable capability of automatically mapping their own procedures within minutes. By standardizing unstructured surgical video data and making these capabilities directly accessible to surgeons, our work brings surgical AI closer to clinical deployment and helps pave the way toward autonomous robotic surgery.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
