Synthetic-To-Real Video Person Re-ID
This work addresses the cost of acquiring and annotating data for person re-identification in public security applications, offering an incremental improvement through domain adaptation.
The paper tackles cross-domain video-based person re-identification by training on synthetic video datasets and testing on real-world videos, reducing reliance on expensive real data. Experiments on five real datasets support the approach and yield a surprising finding: synthetic data can outperform real data in this scenario.
Person re-identification (Re-ID) is an important task with significant applications in public security and information forensics, and it has progressed rapidly with the development of deep learning. In this work, we investigate a novel and challenging Re-ID setting: cross-domain video-based person Re-ID. Specifically, we use synthetic video datasets as the source domain for training and real-world videos as the target domain for testing, which substantially reduces the reliance on expensive real data acquisition and annotation. To harness the potential of synthetic data, we first propose a self-supervised domain-invariant feature learning strategy for both static and dynamic (temporal) features. Additionally, to improve person identification accuracy in the target domain, we propose a mean-teacher scheme incorporating a self-supervised ID consistency loss. Experimental results across five real datasets validate the rationale behind cross-synthetic-real domain adaptation and demonstrate the efficacy of our method. Surprisingly, we find that synthetic data outperforms real data in the cross-domain scenario. The code and data are publicly available at https://github.com/XiangqunZhang/UDA_Video_ReID.
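For readers unfamiliar with the mean-teacher idea mentioned above, the following is a minimal, generic sketch and not the authors' implementation. It assumes the teacher's parameters are an exponential moving average (EMA) of the student's, and it uses a mean squared error between softmax identity predictions as a stand-in for the ID consistency loss, whose exact form the abstract does not specify. The names `ema_update`, `consistency_loss`, and `momentum` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw identity logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ema_update(teacher, student, momentum=0.999):
    """Return new teacher weights: momentum * teacher + (1 - momentum) * student.

    Assumption: parameters are flat lists of floats for illustration only.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def consistency_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher identity distributions.

    Assumption: MSE is a placeholder; the paper's ID consistency loss may differ.
    """
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return sum((a - b) ** 2 for a, b in zip(p_s, p_t)) / len(p_s)


if __name__ == "__main__":
    teacher = [1.0, 2.0]
    student = [3.0, 4.0]
    # The teacher drifts slowly toward the student after each training step.
    teacher = ema_update(teacher, student, momentum=0.5)
    print(teacher)
    # Identical predictions on the same target-domain clip give zero loss.
    print(consistency_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

In a scheme of this kind, the student is trained by gradient descent while the teacher is only updated through the moving average, so the teacher's predictions on unlabeled target-domain videos tend to be smoother and can serve as a self-supervised target.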