Re-ID Driven Localization Refinement for Person Search
This work addresses the person search problem in computer vision by improving localization for re-identification. The contribution appears incremental, since it builds on existing detection and re-ID components.
The paper tackles person search, where the boxes produced by a pedestrian detector may be sub-optimal for re-identification. It proposes a re-ID driven localization refinement framework in which a differentiable ROI transform layer lets re-ID supervision refine the box coordinates. Experimental results show that the method performs favorably against state-of-the-art methods on widely used benchmarks.
Person search aims to localize and identify a query person in a gallery of uncropped scene images. Unlike person re-identification (re-ID), its performance also depends on the localization accuracy of the pedestrian detector. State-of-the-art methods train the detector separately, so the detected bounding boxes may be sub-optimal for the subsequent re-ID task. To alleviate this issue, we propose a re-ID driven localization refinement framework that provides refined detection boxes for person search. Specifically, we develop a differentiable ROI transform layer to effectively transform the bounding boxes from the original images. Because this layer is differentiable, the box coordinates can be supervised by re-ID training in addition to the original detection task. With this supervision, the detector can generate more reliable bounding boxes, and the downstream re-ID model can produce more discriminative embeddings based on the refined person localizations. Extensive experimental results on widely used benchmarks demonstrate that our proposed method performs favorably against state-of-the-art person search methods.
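
To make the idea of supervising box coordinates through a differentiable crop more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the ROI transform can be approximated by bilinear sampling on a regular grid spanned by the box, in the style of a spatial transformer; the abstract does not specify the layer's internals. The names `bilinear`, `roi_transform`, and `box_gradient` are hypothetical, the image is a single-channel list of rows, and the box is assumed to lie inside the image. The point it illustrates is that every pixel of the crop has a well-defined derivative with respect to (x1, y1, x2, y2), so a loss computed on the crop (for example a re-ID loss) can send gradients back to the box.

```python
import math


def bilinear(img, x, y):
    """Sample img (a list of rows) at real-valued (x, y).

    Returns the interpolated value and its partial derivatives w.r.t. x and y.
    """
    h, w = len(img), len(img[0])
    # Top-left corner of the 2x2 neighbourhood, kept inside the image.
    x0 = min(max(int(math.floor(x)), 0), w - 2)
    y0 = min(max(int(math.floor(y)), 0), h - 2)
    fx, fy = x - x0, y - y0
    a, b = img[y0][x0], img[y0][x0 + 1]
    c, d = img[y0 + 1][x0], img[y0 + 1][x0 + 1]
    value = (1 - fy) * ((1 - fx) * a + fx * b) + fy * ((1 - fx) * c + fx * d)
    dv_dx = (1 - fy) * (b - a) + fy * (d - c)
    dv_dy = (1 - fx) * (c - a) + fx * (d - b)
    return value, dv_dx, dv_dy


def roi_transform(img, box, out_h, out_w):
    """Crop `box` = (x1, y1, x2, y2) to a fixed out_h x out_w grid.

    Returns the crop and, per output pixel, its derivatives w.r.t.
    (x1, y1, x2, y2). Assumes out_h >= 2 and out_w >= 2.
    """
    x1, y1, x2, y2 = box
    crop, jac = [], []
    for i in range(out_h):
        ty = i / (out_h - 1)  # 0 at the top edge of the box, 1 at the bottom
        crop_row, jac_row = [], []
        for j in range(out_w):
            tx = j / (out_w - 1)  # 0 at the left edge, 1 at the right
            v, dv_dx, dv_dy = bilinear(img, x1 + tx * (x2 - x1),
                                       y1 + ty * (y2 - y1))
            crop_row.append(v)
            # Chain rule: the sample x is (1 - tx) * x1 + tx * x2, same for y.
            jac_row.append((dv_dx * (1 - tx), dv_dy * (1 - ty),
                            dv_dx * tx, dv_dy * ty))
        crop.append(crop_row)
        jac.append(jac_row)
    return crop, jac


def box_gradient(jac, upstream):
    """Combine per-pixel derivatives with d(loss)/d(crop) into d(loss)/d(box)."""
    grad = [0.0, 0.0, 0.0, 0.0]
    for jac_row, up_row in zip(jac, upstream):
        for pixel_jac, up in zip(pixel_jac_iter(jac_row), up_row):
            for k in range(4):
                grad[k] += up * pixel_jac[k]
    return grad


def pixel_jac_iter(jac_row):
    """Small helper so the loop above reads clearly; yields one 4-tuple per pixel."""
    return iter(jac_row)


# Toy usage: a synthetic 8x8 "image" and a stand-in loss equal to the crop sum.
image = [[float(x * x + 3 * y) for x in range(8)] for y in range(8)]
box = (1.3, 2.2, 5.7, 6.5)
crop, jac = roi_transform(image, box, out_h=4, out_w=3)
upstream = [[1.0] * 3 for _ in range(4)]  # d(sum of crop)/d(crop) is all ones
grad = box_gradient(jac, upstream)
```

In a real system the stand-in loss would be replaced by the re-ID loss computed on the embedding of the crop, and an automatic differentiation framework would derive these gradients rather than the hand-written chain rule used here.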