PieTrack: An MOT solution based on synthetic data training and self-supervised domain adaptation
The work addresses privacy concerns and labeling costs in human tracking for computer vision applications, but its contribution is incremental because it builds on existing synthetic-data approaches.
The paper tackles the domain shift between synthetic and real data in multi-object tracking by proposing a self-supervised domain adaptation method, achieving a HOTA score of 58.7 on the MOT17 test set and ranking third in the 7th BMTT workshop challenge.
To cope with the growing demand for labeled data and the privacy issues surrounding human detection, synthetic data has been used as a substitute for real data and has shown promising results in human detection and tracking tasks. We participate in the 7th Workshop on Benchmarking Multi-Target Tracking (BMTT), themed "How Far Can Synthetic Data Take Us?". Our solution, PieTrack, is developed on synthetic data without using any pre-trained weights. We propose a self-supervised domain adaptation method that mitigates the domain shift between synthetic data (e.g., MOTSynth) and real data (e.g., MOT17) without requiring extra human labels. Using the proposed multi-scale ensemble inference, we achieve a final HOTA score of 58.7 on the MOT17 test set, ranking third in the challenge.
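The abstract does not spell out how the multi-scale ensemble inference is built, so the following is only a minimal sketch of one common way such an ensemble can work: run a detector at several input scales, map each set of boxes back to the original image coordinates, and merge the pooled boxes with greedy non-maximum suppression. The names `multi_scale_ensemble`, `detect_fn`, `nms`, and `iou`, the scale values, and the IoU threshold are all illustrative assumptions and are not taken from PieTrack.

```python
from typing import Callable, List, Sequence, Tuple

# A detection is (x1, y1, x2, y2, score) in pixel coordinates.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], iou_thresh: float = 0.5) -> List[Box]:
    """Greedy non-maximum suppression: keep the highest-scoring box of each cluster."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept


def multi_scale_ensemble(
    detect_fn: Callable[[float], List[Box]],
    scales: Sequence[float],
    iou_thresh: float = 0.5,
) -> List[Box]:
    """Pool detections from several input scales and merge them with NMS.

    `detect_fn(scale)` is a hypothetical detector wrapper (an assumption of
    this sketch): it resizes the image by `scale`, runs the detector, and
    returns boxes in the coordinates of the resized image.
    """
    pooled: List[Box] = []
    for s in scales:
        for x1, y1, x2, y2, score in detect_fn(s):
            # Undo the resize so every box lives in original image coordinates.
            pooled.append((x1 / s, y1 / s, x2 / s, y2 / s, score))
    return nms(pooled, iou_thresh)
```

In a tracking-by-detection pipeline, the merged boxes would then be passed to the association step in place of single-scale detections. Other merging rules, such as weighted box fusion, could be substituted for the NMS step used here.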