The 1st-place Solution for ECCV 2022 Multiple People Tracking in Group Dance Challenge
This work addresses multiple people tracking in crowded dance scenes, but it is incremental: it builds on the existing MOTR tracker with a set of specific modifications.
The paper tackles multiple people tracking in group dance videos with an enhanced transformer-based method, achieving 73.4% HOTA on the DanceTrack test set and surpassing the second-place solution by 6.8% HOTA.
We present our 1st-place solution to the Group Dance Multiple People Tracking Challenge. Building on MOTR: End-to-End Multiple-Object Tracking with Transformer, we explore: 1) detect queries as anchors, 2) tracking as query denoising, 3) joint training on pseudo video clips generated from the CrowdHuman dataset, and 4) using YOLOX detection proposals for the anchor initialization of detect queries. Our method achieves 73.4% HOTA on the DanceTrack test set, surpassing the second-place solution by +6.8% HOTA.
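To make "tracking as query denoising" more concrete, here is a minimal sketch of one way to perturb previous-frame boxes so that a model can be trained to recover the clean boxes. The abstract does not specify the noise scheme, so the function name `jitter_boxes`, the `(cx, cy, w, h)` box format normalized to [0, 1], and both noise scales are assumptions chosen for illustration, not the authors' implementation.

```python
import random


def jitter_boxes(boxes, center_scale=0.05, size_scale=0.1, rng=None):
    """Return noised copies of boxes given as (cx, cy, w, h) in [0, 1].

    Hypothetical helper: the noise scales are illustrative assumptions,
    not values taken from the paper.
    """
    rng = rng or random.Random()

    def clamp(value):
        # Keep every coordinate inside the normalized image range.
        return min(max(value, 0.0), 1.0)

    noised = []
    for cx, cy, w, h in boxes:
        # Shift the center by a small fraction of the box size.
        ncx = cx + rng.uniform(-1.0, 1.0) * center_scale * w
        ncy = cy + rng.uniform(-1.0, 1.0) * center_scale * h
        # Rescale width and height by a factor close to 1.
        nw = w * (1.0 + rng.uniform(-1.0, 1.0) * size_scale)
        nh = h * (1.0 + rng.uniform(-1.0, 1.0) * size_scale)
        noised.append((clamp(ncx), clamp(ncy), clamp(nw), clamp(nh)))
    return noised


# Two example boxes standing in for tracked people in the previous frame.
prev_boxes = [(0.30, 0.50, 0.10, 0.40), (0.62, 0.48, 0.12, 0.42)]

# The noised boxes would act as anchors for the track queries, and the
# clean boxes would be the regression targets of the denoising task.
noisy_queries = jitter_boxes(prev_boxes, rng=random.Random(0))
targets = prev_boxes
```

In this sketch the perturbation is proportional to box size, so small and large people receive comparable relative noise. A real training pipeline would apply the same idea to tensors inside the data loader or model.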