Long-term Tracking in the Wild: A Benchmark
The OxUvA benchmark addresses the disparity between short-term academic tracking benchmarks and the practical need for long-term tracking in real-world scenarios, providing a large-scale resource for the computer vision community.
The authors address the lack of long-term tracking benchmarks by introducing the OxUvA dataset: 366 sequences spanning 14 hours of video, with an average length of more than two minutes and frequent target disappearance. They evaluate several algorithms on both locating the target and determining whether it is present or absent.
We introduce the OxUvA dataset and benchmark for evaluating single-object tracking algorithms. Benchmarks have enabled great strides in the field of object tracking by defining standardized evaluations on large sets of diverse videos. However, these works have focused exclusively on sequences that are just tens of seconds in length and in which the target is always visible. Consequently, most researchers have designed methods tailored to this "short-term" scenario, which is poorly representative of practitioners' needs. Aiming to address this disparity, we compile a long-term, large-scale tracking dataset of sequences with average length greater than two minutes and with frequent target object disappearance. The OxUvA dataset is much larger than the object tracking datasets of recent years: it comprises 366 sequences spanning 14 hours of video. We assess the performance of several algorithms, considering both the ability to locate the target and to determine whether it is present or absent. Our goal is to offer the community a large and diverse benchmark to enable the design and evaluation of tracking methods ready to be used "in the wild". The project website is http://oxuva.net
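The abstract states that trackers are assessed both on locating the target and on deciding whether it is present or absent. As an illustration only, the sketch below shows one plausible way to score a tracker under those two criteria, using a true positive rate over frames where the target is present and a true negative rate over frames where it is absent. The function names, the box format, the use of `None` for "absent", and the 0.5 overlap threshold are all assumptions made for this example; they are not taken from the abstract and may differ from the benchmark's actual protocol.

```python
from typing import Optional, Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def presence_aware_scores(
    truth: Sequence[Optional[Box]],
    preds: Sequence[Optional[Box]],
    iou_threshold: float = 0.5,  # assumed threshold, not from the abstract
) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate).

    A frame's ground truth or prediction is None when the target is
    (declared) absent. A present frame counts as a true positive only if
    the tracker reports a box that overlaps the ground truth enough. An
    absent frame counts as a true negative only if the tracker also
    reports absence.
    """
    tp = tn = n_present = n_absent = 0
    for gt, pred in zip(truth, preds):
        if gt is None:
            n_absent += 1
            if pred is None:
                tn += 1
        else:
            n_present += 1
            if pred is not None and iou(gt, pred) >= iou_threshold:
                tp += 1
    tpr = tp / n_present if n_present else 0.0
    tnr = tn / n_absent if n_absent else 0.0
    return tpr, tnr


if __name__ == "__main__":
    # Toy four-frame sequence: target present in two frames, absent in two.
    truth = [(0, 0, 10, 10), (0, 0, 10, 10), None, None]
    preds = [(0, 0, 10, 10), (20, 20, 30, 30), None, (0, 0, 5, 5)]
    print(presence_aware_scores(truth, preds))
```

Reporting the two rates separately keeps a tracker that always outputs a box (high true positive rate, zero true negative rate) distinguishable from one that actually detects disappearance, which is the behaviour a long-term benchmark with frequent target absence is meant to expose.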