CV | Dec 1, 2025

MV-TAP: Tracking Any Point in Multi-View Videos

Jahyeok Koo, Inès Hyeonsu Kim, Mungyeom Kim, Junghyun Park, Seohyun Park, Jaeyeong Kim, Jung Yi, Seokju Cho, Seungryong Kim

arXiv:2512.02006v1 | 3.61 citations | h-index: 13

Originality: Incremental advance

AI Analysis

MV-TAP addresses the need for reliable point tracking in multi-view camera systems, with applications such as scene understanding. The contribution appears incremental: it extends existing point-tracking methods with a multi-view approach.

The paper tackles the problem of tracking points across multi-view videos of dynamic scenes by introducing MV-TAP, a novel point tracker that leverages cross-view information, and it outperforms existing methods on challenging benchmarks.

Multi-view camera systems enable rich observations of complex real-world scenes, and understanding dynamic objects in multi-view settings has become central to various applications. In this work, we present MV-TAP, a novel point tracker that tracks points across multi-view videos of dynamic scenes by leveraging cross-view information. MV-TAP utilizes camera geometry and a cross-view attention mechanism to aggregate spatio-temporal information across views, enabling more complete and reliable trajectory estimation in multi-view videos. To support this task, we construct a large-scale synthetic training dataset and real-world evaluation sets tailored for multi-view tracking. Extensive experiments demonstrate that MV-TAP outperforms existing point-tracking methods on challenging benchmarks, establishing an effective baseline for advancing research in multi-view point tracking.
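The abstract mentions a cross-view attention mechanism without giving details. As a rough illustration of the general idea only (not the authors' architecture), the sketch below applies plain scaled dot-product attention across the view axis, so each point's feature in one view is refreshed with a weighted mix of the same point's features in all views. The function name `cross_view_attention`, the `(V, T, N, C)` feature layout, and the absence of learned projections and camera-geometry terms are all assumptions made for brevity.

```python
import numpy as np

def cross_view_attention(feats):
    """Hypothetical sketch: attend across views for each (frame, point) pair.

    feats: array of shape (V, T, N, C) = views, frames, points, channels.
    Returns the aggregated features (V, T, N, C) and the attention
    weights (T, N, V, V). No learned projections or camera geometry.
    """
    V, T, N, C = feats.shape
    # Put the view axis next to channels so attention runs across views.
    x = np.transpose(feats, (1, 2, 0, 3))               # (T, N, V, C)
    # Scaled dot-product similarity between every pair of views.
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(C)    # (T, N, V, V)
    # Numerically stable softmax over the key-view axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each view's feature becomes a weighted mix of all views' features.
    out = weights @ x                                   # (T, N, V, C)
    return np.transpose(out, (2, 0, 1, 3)), weights

# Usage: 3 views, 4 frames, 5 tracked points, 8-dim features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4, 5, 8))
fused, weights = cross_view_attention(feats)
```

A real model would presumably add learned query/key/value projections and inject camera geometry (for example as a positional encoding), but the abstract does not specify how, so those parts are left out here.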
