CVApr 7, 2022

PSTR: End-to-End One-Step Person Search With Transformers

arXiv:2204.03340v176 citationsh-index: 121Has Code
AI Analysis

This work addresses the challenge of efficiently locating and identifying specific individuals in images, which is incremental as it builds on existing transformer methods for person search.

The authors tackled the problem of person search by proposing PSTR, a one-step transformer-based framework that jointly performs person detection and re-identification, achieving a state-of-the-art mean average precision of 56.5% on the PRW benchmark.

We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture. PSTR comprises a person search-specialized (PSS) module that contains a detection encoder-decoder for person detection along with a discriminative re-id decoder for person re-id. The discriminative re-id decoder utilizes a multi-level supervision scheme with a shared decoder for discriminative re-id feature learning and also comprises a part attention block to encode relationship between different parts of a person. We further introduce a simple multi-scale scheme to support re-id across person instances at different scales. PSTR jointly achieves the diverse objectives of object-level recognition (detection) and instance-level matching (re-id). To the best of our knowledge, we are the first to propose an end-to-end one-step transformer-based person search framework. Experiments are performed on two popular benchmarks: CUHK-SYSU and PRW. Our extensive ablations reveal the merits of the proposed contributions. Further, the proposed PSTR sets a new state-of-the-art on both benchmarks. On the challenging PRW benchmark, PSTR achieves a mean average precision (mAP) score of 56.5%. The source code is available at \url{https://github.com/JialeCao001/PSTR}.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes