Multi-tracklet Tracking for Generic Targets with Adaptive Detection Clustering
This work addresses tracking challenges for unseen categories in vision-based systems, representing an incremental improvement over existing methods.
The paper tackles the problem of tracking unseen object categories in real-world scenarios by proposing a Multi-Tracklet Tracking (MTT) framework that integrates adaptive detection clustering and multi-tracklet association, demonstrating competitiveness on a generic multiple object tracking benchmark.
Recent vision-based multitarget tracking studies have focused on tracking specific targets, such as pedestrians and vehicles. However, in some real-world scenarios, unseen categories often challenge existing methods due to low-confidence detections, weak motion and appearance constraints, and long-term occlusions. To address these issues, this article proposes a tracklet-enhanced tracker called Multi-Tracklet Tracking (MTT) that integrates flexible tracklet generation into a multi-tracklet association framework. The framework first adaptively clusters the detection results into robust tracklets according to their short-term spatio-temporal correlation. It then estimates the best tracklet partitions using multiple cues, such as location and appearance over time, to mitigate error propagation in long-term association. Finally, extensive experiments on the benchmark for generic multiple object tracking demonstrate the competitiveness of the proposed framework.
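To make the first stage more concrete, the following is a minimal sketch of how detections might be grouped into short tracklets by spatio-temporal overlap. It is not the MTT clustering procedure, which the abstract does not specify; it substitutes a simple greedy IoU linking rule. The names `Tracklet`, `iou`, and `cluster_detections`, and the parameters `iou_thresh` and `max_gap`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class Tracklet:
    """A short run of detections assumed to belong to one target."""
    frames: List[int] = field(default_factory=list)
    boxes: List[Box] = field(default_factory=list)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cluster_detections(
    detections: Dict[int, List[Box]],
    iou_thresh: float = 0.5,  # assumed spatial-correlation threshold
    max_gap: int = 2,         # assumed short-term temporal window, in frames
) -> List[Tracklet]:
    """Greedily link per-frame detections into tracklets (illustrative only)."""
    tracklets: List[Tracklet] = []
    for frame in sorted(detections):
        boxes = detections[frame]
        # Collect every (tracklet, detection) pair that is close in both
        # time (within max_gap frames) and space (IoU above the threshold).
        pairs = []
        for ti, t in enumerate(tracklets):
            if 0 < frame - t.frames[-1] <= max_gap:
                for di, box in enumerate(boxes):
                    overlap = iou(t.boxes[-1], box)
                    if overlap >= iou_thresh:
                        pairs.append((overlap, ti, di))
        # Assign the strongest overlaps first; each tracklet and each
        # detection is used at most once per frame.
        pairs.sort(reverse=True)
        used_t, used_d = set(), set()
        for _, ti, di in pairs:
            if ti in used_t or di in used_d:
                continue
            tracklets[ti].frames.append(frame)
            tracklets[ti].boxes.append(boxes[di])
            used_t.add(ti)
            used_d.add(di)
        # Any detection left unassigned starts a new tracklet.
        for di, box in enumerate(boxes):
            if di not in used_d:
                tracklets.append(Tracklet([frame], [box]))
    return tracklets
```

In a sketch like this, a detection that reappears after more than `max_gap` frames starts a new tracklet rather than extending an old one. Reconnecting such fragments across long occlusions would be left to the second stage, the multi-tracklet association, which this example does not attempt to model.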