CVMay 9, 2025

You Are Your Best Teacher: Semi-Supervised Surgical Point Tracking with Cycle-Consistent Self-Distillation

arXiv:2505.05722v1h-index: 5
Originality Incremental advance
AI Analysis

This work addresses the challenge of robust point tracking in surgical videos, a high-shift and data-scarce domain, with an incremental improvement over existing adaptation methods.

The paper tackles the problem of adapting synthetic-trained point trackers to surgical videos, which suffer from domain shift and lack of labeled data, by proposing SurgTracker, a semi-supervised framework that uses filtered self-distillation with cycle consistency, achieving improved tracking performance on the STIR benchmark using only 80 unlabeled videos.

Synthetic datasets have enabled significant progress in point tracking by providing large-scale, densely annotated supervision. However, deploying these models in real-world domains remains challenging due to domain shift and lack of labeled data-issues that are especially severe in surgical videos, where scenes exhibit complex tissue deformation, occlusion, and lighting variation. While recent approaches adapt synthetic-trained trackers to natural videos using teacher ensembles or augmentation-heavy pseudo-labeling pipelines, their effectiveness in high-shift domains like surgery remains unexplored. This work presents SurgTracker, a semi-supervised framework for adapting synthetic-trained point trackers to surgical video using filtered self-distillation. Pseudo-labels are generated online by a fixed teacher-identical in architecture and initialization to the student-and are filtered using a cycle consistency constraint to discard temporally inconsistent trajectories. This simple yet effective design enforces geometric consistency and provides stable supervision throughout training, without the computational overhead of maintaining multiple teachers. Experiments on the STIR benchmark show that SurgTracker improves tracking performance using only 80 unlabeled videos, demonstrating its potential for robust adaptation in high-shift, data-scarce domains.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes