Transferring a Semantic Representation for Person Re-Identification and Search
This work addresses the scalability issue in person re-identification for surveillance by reducing the need for extensive domain-specific annotation.
The paper tackles person re-identification and search with a semantic attribute learning approach that transfers knowledge from fashion photography datasets to the surveillance domain without requiring surveillance-specific annotations. It achieves state-of-the-art performance in the unsupervised setting and near state-of-the-art performance in the supervised setting.
Learning semantic attributes for person re-identification and description-based person search has gained increasing interest due to attributes' great potential as a pose- and view-invariant representation. However, existing attribute-centric approaches have thus far underperformed state-of-the-art conventional approaches. This is due to their non-scalable need for extensive domain-specific (camera-specific) annotation. In this paper we present a new semantic attribute learning approach for person re-identification and search. Our model is trained on existing fashion photography datasets -- either weakly or strongly labelled. It can then be transferred and adapted to provide a powerful semantic description of surveillance person detections, without requiring any surveillance domain supervision. The resulting representation is useful for both unsupervised and supervised person re-identification, achieving state-of-the-art and near state-of-the-art performance respectively. Furthermore, as a semantic representation, it allows description-based person search to be integrated within the same framework.
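
To illustrate how a single attribute representation could serve both tasks, here is a minimal sketch in Python. It is not the paper's model: the attribute vocabulary, the scores, the Euclidean matching rule, and the helper names (`rank_by_reid`, `rank_by_description`) are all assumptions made for illustration. It assumes each person detection has already been mapped to a vector of attribute scores in [0, 1].

```python
from math import sqrt

# Hypothetical attribute vocabulary; the paper's actual attribute set is not given here.
ATTRIBUTES = ["red", "blue", "jacket", "skirt", "jeans"]


def euclidean(a, b):
    """Euclidean distance between two attribute score vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_reid(probe, gallery):
    """Unsupervised re-identification: gallery ids sorted by distance
    between attribute vectors, closest first."""
    return sorted(gallery, key=lambda pid: euclidean(probe, gallery[pid]))


def rank_by_description(query_attrs, gallery):
    """Description-based search: gallery ids sorted by the mean predicted
    score of the queried attributes, highest first."""
    idx = [ATTRIBUTES.index(a) for a in query_attrs]

    def score(pid):
        return sum(gallery[pid][i] for i in idx) / len(idx)

    return sorted(gallery, key=score, reverse=True)


# Made-up attribute scores for three gallery detections (illustrative only).
gallery = {
    "person_a": [0.9, 0.1, 0.8, 0.1, 0.7],
    "person_b": [0.1, 0.8, 0.2, 0.9, 0.1],
    "person_c": [0.2, 0.9, 0.7, 0.1, 0.8],
}
# Attribute scores of a probe detection from another camera (also made up).
probe = [0.8, 0.2, 0.9, 0.0, 0.6]

reid_ranking = rank_by_reid(probe, gallery)
search_ranking = rank_by_description(["blue", "skirt"], gallery)
```

In this toy setup the probe is matched to `person_a` first, while the query "blue, skirt" ranks `person_b` first. Both rankings are computed from the same attribute vectors, which is the sense in which search and re-identification can share one framework.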