CV · Aug 17, 2023

Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification

Zhaopeng Dou, Zhongdao Wang, Yali Li, Shengjin Wang

arXiv:2308.08887v1 · 11.027 citations · h-index: 50 · Has Code

Originality: Highly original

AI Analysis

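The paper addresses the high annotation cost and limited labeled data in person re-identification for surveillance and security applications, offering a scalable approach whose training cost grows approximately linearly with data size and whose representation generalizes across domains.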

This paper tackles the problem of learning domain-generalizable person re-identification representations without any annotation by proposing an identity-seeking self-supervised method. Without fine-tuning, it achieves 87.0% Rank-1 on Market-1501 and 56.4% Rank-1 on MSMT17, outperforming the best supervised domain-generalizable method by 5.0% and 19.5%, respectively.

This paper aims to learn a domain-generalizable (DG) person re-identification (ReID) representation from large-scale videos without any annotation. Prior DG ReID methods employ limited labeled data for training due to the high cost of annotation, which restricts further advances. To overcome the barriers of data and annotation, we propose to utilize large-scale unsupervised data for training. The key issue lies in how to mine identity information. To this end, we propose an Identity-seeking Self-supervised Representation learning (ISR) method. ISR constructs positive pairs from inter-frame images by modeling the instance association as a maximum-weight bipartite matching problem. A reliability-guided contrastive loss is further presented to suppress the adverse impact of noisy positive pairs, ensuring that reliable positive pairs dominate the learning process. The training cost of ISR scales approximately linearly with the data size, making it feasible to utilize large-scale data for training. The learned representation exhibits superior generalization ability. Without human annotation and fine-tuning, ISR achieves 87.0% Rank-1 on Market-1501 and 56.4% Rank-1 on MSMT17, outperforming the best supervised domain-generalizable method by 5.0% and 19.5%, respectively. In the pre-training → fine-tuning scenario, ISR achieves state-of-the-art performance, with 88.4% Rank-1 on MSMT17. The code is at https://github.com/dcp15/ISR_ICCV2023_Oral.
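
To make the two ideas in the abstract concrete, here is a minimal sketch of positive-pair construction by maximum-weight bipartite matching, followed by a contrastive loss in which each matched pair is weighted by a reliability score. It is an illustration under stated assumptions, not the authors' implementation: the function names, the use of cosine similarity as edge weight, the clipped-similarity reliability score, the temperature value, and the assumption that both frames contain the same number of person crops are all hypothetical choices. The matching is solved by brute force so the example needs only the standard library.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def max_weight_matching(weights):
    """Maximum-weight perfect matching on a square weight matrix.

    Brute force over all permutations: only suitable for tiny examples.
    Returns a list of (row, column) pairs.
    """
    n = len(weights)
    best_perm, best_total = None, -math.inf
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return list(enumerate(best_perm))


def reliability_weighted_loss(frame_a, frame_b, temperature=0.1):
    """Match instances across two frames, then compute a weighted InfoNCE loss.

    Assumption: reliability of a pair is its similarity clipped at zero.
    The paper's actual reliability definition may differ.
    """
    sims = [[cosine(u, v) for v in frame_b] for u in frame_a]
    pairs = max_weight_matching(sims)
    reliabilities = [max(sims[i][j], 0.0) for i, j in pairs]

    losses = []
    for i, j in pairs:
        logits = [s / temperature for s in sims[i]]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[j])  # -log softmax of the positive

    total_reliability = sum(reliabilities)
    if total_reliability == 0.0:
        return pairs, 0.0
    weighted = sum(r * l for r, l in zip(reliabilities, losses))
    return pairs, weighted / total_reliability


# Toy features for two person crops in each of two frames.
frame_a = [[1.0, 0.0], [0.0, 1.0]]
frame_b = [[0.1, 0.9], [0.9, 0.1]]
pairs, loss = reliability_weighted_loss(frame_a, frame_b)
print(pairs, loss)
```

Global matching matters because a greedy choice can lock in a poor assignment: for the weight matrix `[[0.9, 0.8], [0.85, 0.1]]`, picking the largest entry first yields a total of 1.0, while the optimal matching yields 1.65. At realistic sizes a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` with `maximize=True` would replace the brute-force search.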
