CV · Nov 17, 2022

MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors

arXiv:2211.09791v2 · 270 citations · h-index: 27 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the detection bottleneck in end-to-end multi-object tracking for applications like group dance analysis and autonomous driving, representing an incremental improvement over prior methods.

The paper tackles the problem of poor detection performance in end-to-end multi-object tracking methods by proposing MOTRv2, which incorporates a pretrained object detector to generate proposals as anchors, achieving state-of-the-art results with 73.4% HOTA on DanceTrack and strong performance on BDD100K.

In this paper, we propose MOTRv2, a simple yet effective pipeline to bootstrap end-to-end multi-object tracking with a pretrained object detector. Existing end-to-end methods, MOTR and TrackFormer, are inferior to their tracking-by-detection counterparts mainly due to their poor detection performance. We aim to improve MOTR by elegantly incorporating an extra object detector. We first adopt the anchor formulation of queries and then use an extra object detector to generate proposals as anchors, providing a detection prior to MOTR. This simple modification greatly eases the conflict between jointly learning the detection and association tasks in MOTR. MOTRv2 keeps the query propagation feature and scales well on large-scale benchmarks. MOTRv2 ranks 1st (73.4% HOTA on DanceTrack) in the 1st Multiple People Tracking in Group Dance Challenge. Moreover, MOTRv2 reaches state-of-the-art performance on the BDD100K dataset. We hope this simple and effective pipeline can provide some new insights to the end-to-end MOT community. Code is available at https://github.com/megvii-research/MOTRv2.
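The anchor idea in the abstract can be illustrated with a minimal sketch. It is not taken from the MOTRv2 repository: the `Query` class, the `box_to_anchor` and `build_frame_queries` helpers, the normalized (cx, cy, w, h) anchor layout, and the score threshold are all assumptions made for illustration. It shows only how detector proposals might be turned into anchors and combined with track queries propagated from the previous frame.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # pixel (x1, y1, x2, y2), assumed format


@dataclass
class Query:
    """Hypothetical query record; not the MOTRv2 data structure."""
    anchor: Tuple[float, float, float, float]  # normalized (cx, cy, w, h)
    score: float                               # detector confidence (assumed 1.0 for tracks)
    track_id: Optional[int]                    # None for a fresh proposal query


def box_to_anchor(box: Box, img_w: float, img_h: float):
    """Convert a pixel-space corner box to a normalized center/size anchor."""
    x1, y1, x2, y2 = box
    return (
        (x1 + x2) / 2 / img_w,
        (y1 + y2) / 2 / img_h,
        (x2 - x1) / img_w,
        (y2 - y1) / img_h,
    )


def build_frame_queries(
    detections: List[Tuple[Box, float]],
    prev_tracks: List[Query],
    img_w: float,
    img_h: float,
    score_thresh: float = 0.05,  # illustrative value, not from the paper
) -> List[Query]:
    """Proposal queries from the external detector, followed by the
    track queries carried over from the previous frame."""
    proposals = [
        Query(box_to_anchor(box, img_w, img_h), score, None)
        for box, score in detections
        if score >= score_thresh
    ]
    return proposals + list(prev_tracks)


# Usage: two detections (one below the threshold) plus one propagated track.
detections = [((10, 20, 110, 220), 0.9), ((0, 0, 50, 50), 0.01)]
prev_tracks = [Query((0.5, 0.5, 0.1, 0.2), 1.0, 7)]
queries = build_frame_queries(detections, prev_tracks, img_w=200, img_h=400)
```

In this sketch the detector supplies the box locations, so the tracking model would only need to refine the anchors and associate identities. Lower detection burden on the tracker is the motivation the abstract gives for the design.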

Code Implementations: 4 repos
