CVSep 7, 2023

Region Generation and Assessment Network for Occluded Person Re-Identification

Shuting He, Weihua Chen, Kai Wang, Hao Luo, Fan Wang, Wei Jiang, Henghui Ding

Stanford

arXiv:2309.03558v114.576 citationsh-index: 74

Originality Incremental advance

AI Analysis

This addresses performance degradation in Person Re-Identification for surveillance and security applications, offering a novel approach to handle occlusions without external tools, though it is incremental in improving existing methods.

The paper tackles the problem of misalignment and occlusions in Person Re-Identification by proposing a Region Generation and Assessment Network (RGANet) that detects human body regions and highlights important ones, achieving state-of-the-art results on six benchmarks across occluded, partial, and holistic tasks.

Person Re-identification (ReID) plays a more and more crucial role in recent years with a wide range of applications. Existing ReID methods are suffering from the challenges of misalignment and occlusions, which degrade the performance dramatically. Most methods tackle such challenges by utilizing external tools to locate body parts or exploiting matching strategies. Nevertheless, the inevitable domain gap between the datasets utilized for external tools and the ReID datasets and the complicated matching process make these methods unreliable and sensitive to noises. In this paper, we propose a Region Generation and Assessment Network (RGANet) to effectively and efficiently detect the human body regions and highlight the important regions. In the proposed RGANet, we first devise a Region Generation Module (RGM) which utilizes the pre-trained CLIP to locate the human body regions using semantic prototypes extracted from text descriptions. Learnable prompt is designed to eliminate domain gap between CLIP datasets and ReID datasets. Then, to measure the importance of each generated region, we introduce a Region Assessment Module (RAM) that assigns confidence scores to different regions and reduces the negative impact of the occlusion regions by lower scores. The RAM consists of a discrimination-aware indicator and an invariance-aware indicator, where the former indicates the capability to distinguish from different identities and the latter represents consistency among the images of the same class of human body regions. Extensive experimental results for six widely-used benchmarks including three tasks (occluded, partial, and holistic) demonstrate the superiority of RGANet against state-of-the-art methods.

View on arXiv PDF

Similar