Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023
This work improves point tracking accuracy in videos shot by a static camera.
The paper tackles cumulative error in the Tracking Any Point (TAP) task by proposing TAPIR+, a method that rectifies the tracks of static points in videos shot by a static camera. It ranked first in the ICCV 2023 challenge with a score of 0.46.
This report proposes an improved method for the Tracking Any Point (TAP) task, which tracks any point on a physical surface through a video. Several existing approaches to TAP exploit temporal relationships to obtain smooth point motion trajectories; however, they still suffer from cumulative error caused by temporal prediction. To address this issue, we propose a simple yet effective approach called TAP with confident static points (TAPIR+), which focuses on rectifying the tracking of static points in videos shot by a static camera. Our approach contains two key components: (1) Multi-granularity Camera Motion Detection, which identifies video sequences shot by a static camera; and (2) CMR-based point trajectory prediction, which uses a moving object segmentation approach to isolate static points from moving objects. Our approach ranked first in the final test with a score of 0.46.
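The rectification idea can be illustrated with a minimal sketch. This is not the implementation used in the report: the function name `rectify_static_points`, the data layout (per-point lists of `(x, y)` positions, queries as `(frame, x, y)`, per-frame boolean masks), and the rule of testing only the query pixel against the moving-object mask are all assumptions made for illustration. The camera motion decision and the segmentation masks are taken as given inputs.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def rectify_static_points(
    tracks: Sequence[Sequence[Point]],
    queries: Sequence[Tuple[int, float, float]],
    moving_masks: Sequence[Sequence[Sequence[bool]]],
    camera_is_static: bool,
) -> List[List[Point]]:
    """Pin points whose query pixel is outside the moving-object mask.

    Hypothetical helper: names and data layout are illustrative assumptions.
    """
    # With a moving camera, background points shift in image space,
    # so the tracker output is returned unchanged.
    if not camera_is_static:
        return [list(track) for track in tracks]

    rectified = []
    for track, (t0, x0, y0) in zip(tracks, queries):
        mask = moving_masks[t0]
        height, width = len(mask), len(mask[0])
        # Clamp the rounded query pixel to the mask bounds.
        row = min(max(int(round(y0)), 0), height - 1)
        col = min(max(int(round(x0)), 0), width - 1)
        if mask[row][col]:
            # Query lies on a moving object: keep the tracker's prediction.
            rectified.append(list(track))
        else:
            # Query lies on static background: hold its position in all frames.
            rectified.append([(x0, y0)] * len(track))
    return rectified


# Toy example: 3 frames of a 2x2 image whose right column is "moving".
masks = [[[False, True], [False, True]] for _ in range(3)]
tracks = [
    [(0.0, 0.0), (0.2, 0.1), (0.5, 0.3)],  # background point with drift
    [(1.0, 1.0), (1.0, 0.6), (1.0, 0.2)],  # point on the moving object
]
queries = [(0, 0.0, 0.0), (0, 1.0, 1.0)]
result = rectify_static_points(tracks, queries, masks, camera_is_static=True)
```

In this toy example the drifting background track is replaced by its query position in every frame, while the track on the moving object is left as predicted. Holding a static point fixed removes the drift that would otherwise accumulate over frames, which is the motivation stated above.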