CV · Sep 11, 2020

Devil's in the Details: Aligning Visual Clues for Conditional Embedding in Person Re-Identification

Fufu Yu, Xinyang Jiang, Yifei Gong, Shizhen Zhao, Xiaowei Guo, Wei-Shi Zheng, Feng Zheng, Xing Sun

arXiv:2009.05250v24.22 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of accurate person matching in surveillance and security applications and represents an incremental improvement over existing methods.

The paper tackles the challenge of matching detailed visual information in Person Re-Identification under conditions like occlusion and viewpoint changes by proposing a method that aligns decisive region pairs and adjusts features conditionally, achieving state-of-the-art performance on three public datasets.

Although Person Re-Identification has made impressive progress, difficult cases like occlusion, change of viewpoint, and similar clothing still bring great challenges. Besides overall visual features, matching and comparing detailed information is also essential for tackling these challenges. This paper proposes two key recognition patterns, which most existing methods are unable to satisfy, to better utilize the detail information of pedestrian images. Firstly, Visual Clue Alignment requires the model to select and align decisive region pairs from two images for pair-wise comparison, while existing methods only align regions with predefined rules like high feature similarity or same semantic labels. Secondly, Conditional Feature Embedding requires the overall feature of a query image to be dynamically adjusted based on the gallery image it matches, while most existing methods ignore the reference images. By introducing novel techniques including a correspondence attention module and a discrepancy-based GCN, we propose an end-to-end ReID method that integrates both patterns into a unified framework, called CACE-Net ((C)lue (A)lignment and (C)onditional (E)mbedding). The experiments show that CACE-Net achieves state-of-the-art performance on three public datasets.
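
The Conditional Feature Embedding idea in the abstract can be illustrated with a minimal sketch. This is not the CACE-Net implementation: the abstract does not specify the correspondence attention module or the discrepancy-based GCN, so plain scaled dot-product attention stands in for them. The function names `cross_attention` and `conditional_embedding`, the scaling factor, and the max-similarity relevance score are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(query_regions, gallery_regions):
    """For each query region, return attention weights over gallery regions.

    Assumption: scaled dot-product similarity replaces the paper's
    correspondence attention module, whose details are not given here.
    """
    scale = math.sqrt(len(query_regions[0]))
    return [
        softmax([dot(q, g) / scale for g in gallery_regions])
        for q in query_regions
    ]


def conditional_embedding(query_regions, gallery_regions):
    """Pool query regions with weights that depend on the gallery image.

    Assumption: a query region's relevance is its best similarity to any
    gallery region, so the same query yields different embeddings for
    different gallery images.
    """
    scale = math.sqrt(len(query_regions[0]))
    relevance = [
        max(dot(q, g) for g in gallery_regions) / scale
        for q in query_regions
    ]
    weights = softmax(relevance)
    dim = len(query_regions[0])
    return [
        sum(w * q[d] for w, q in zip(weights, query_regions))
        for d in range(dim)
    ]


# Hypothetical toy data: two query regions and two single-region gallery images.
query = [[1.0, 0.0], [0.0, 1.0]]
gallery_a = [[1.0, 0.0]]
gallery_b = [[0.0, 1.0]]

print(conditional_embedding(query, gallery_a))
print(conditional_embedding(query, gallery_b))
```

In this toy setup the query embedding leans toward whichever region the gallery image shares with it, which is the behavior the abstract describes as adjusting the query feature based on the gallery image it is matched against.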

