CV · Mar 6

Match4Annotate: Propagating Sparse Video Annotations via Implicit Neural Feature Matching

Zhuorui Zhang, Roger Pallarès-López, Praneeth Namburi, Brian W. Anthony

arXiv:2603.06471v1 · 7.9 · h-index: 22

Predicted impact: top 64% in CV · last 90 days
Originality: Incremental advance

AI Analysis

The work addresses the bottleneck of costly expert labeling in domains such as medical imaging and offers a scalable annotation solution. It appears incremental, however, because it builds on existing feature matching and implicit neural representation techniques.

The paper tackles the cost of acquiring per-frame video annotations in specialized domains such as medical imaging. It proposes Match4Annotate, a lightweight framework for propagating point and mask annotations within and across videos, and reports state-of-the-art inter-video propagation on three clinical ultrasound datasets.

Acquiring per-frame video annotations remains a primary bottleneck for deploying computer vision in specialized domains such as medical imaging, where expert labeling is slow and costly. Label propagation offers a natural solution, yet existing approaches face fundamental limitations. Video trackers and segmentation models can propagate labels within a single sequence but require per-video initialization and cannot generalize across videos. Classic correspondence pipelines operate on detector-chosen keypoints and struggle in low-texture scenes, while dense feature matching and one-shot segmentation methods enable cross-video propagation but lack spatiotemporal smoothness and unified support for both point and mask annotations. We present Match4Annotate, a lightweight framework for both intra-video and inter-video propagation of point and mask annotations. Our method fits a SIREN-based implicit neural representation to DINOv3 features at test time, producing a continuous, high-resolution spatiotemporal feature field, and learns a smooth implicit deformation field between frame pairs to guide correspondence matching. We evaluate on three challenging clinical ultrasound datasets. Match4Annotate achieves state-of-the-art inter-video propagation, outperforming feature matching and one-shot segmentation baselines, while remaining competitive with specialized trackers for intra-video propagation. Our results show that lightweight, test-time-optimized feature matching pipelines have the potential to offer an efficient and accessible solution for scalable annotation workflows.
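To make the idea of a continuous spatiotemporal feature field concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds a small SIREN (sine-activated MLP) that maps a coordinate (x, y, t) to a feature vector, then propagates a point annotation by picking the location in a target frame whose feature has the highest cosine similarity to the query feature. Everything here is an assumption made for illustration: the function names `init_siren`, `siren_features`, and `match_point`, the layer sizes, the frequency `w0 = 30`, and the grid resolution are all hypothetical. The weights are random rather than fitted to DINOv3 features, and the learned deformation field described in the abstract is not modeled.

```python
import numpy as np


def init_siren(rng, in_dim, hidden, out_dim, n_hidden=2, w0=30.0):
    """Random SIREN weights (hypothetical sizes; untrained, for illustration).

    Uses the commonly cited SIREN initialization: U(-1/n, 1/n) for the first
    layer and U(-sqrt(6/n)/w0, sqrt(6/n)/w0) for later layers.
    """
    dims = [in_dim] + [hidden] * n_hidden + [out_dim]
    params = []
    for i, (n_in, n_out) in enumerate(zip(dims[:-1], dims[1:])):
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / w0
        weights = rng.uniform(-bound, bound, size=(n_in, n_out))
        params.append((weights, np.zeros(n_out)))
    return params


def siren_features(params, coords, w0=30.0):
    """Evaluate the field at continuous (x, y, t) coordinates, shape (N, 3)."""
    h = np.asarray(coords, dtype=float)
    for weights, bias in params[:-1]:
        h = np.sin(w0 * (h @ weights + bias))  # sine activation on hidden layers
    weights, bias = params[-1]
    return h @ weights + bias  # linear output layer


def match_point(params, query_xyt, target_t, grid_size=16):
    """Return the (x, y) in frame `target_t` most similar to the query point."""
    query_feat = siren_features(params, np.asarray(query_xyt)[None, :])[0]
    axis = np.linspace(-1.0, 1.0, grid_size)
    gx, gy = np.meshgrid(axis, axis, indexing="xy")
    grid = np.stack(
        [gx.ravel(), gy.ravel(), np.full(gx.size, target_t)], axis=1
    )
    feats = siren_features(params, grid)
    # Cosine similarity between the query feature and every grid feature.
    norms = np.linalg.norm(feats, axis=1) * np.linalg.norm(query_feat) + 1e-12
    sim = (feats @ query_feat) / norms
    best = int(np.argmax(sim))
    return grid[best, :2], float(sim[best])


# Usage: propagate an annotated point from time t=0.0 to time t=0.5.
params = init_siren(np.random.default_rng(0), in_dim=3, hidden=64, out_dim=16)
xy, score = match_point(params, query_xyt=(0.2, -0.4, 0.0), target_t=0.5)
```

Because the field is a function of continuous coordinates, it can be queried at any resolution or time step, which is what makes a denser grid or sub-frame timestamps possible without storing a feature volume. With random weights the match returned above is meaningless; the sketch only shows the data flow that a fitted field would follow.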
