CV · Apr 16, 2023

A Novel end-to-end Framework for Occluded Pixel Reconstruction with Spatio-temporal Features for Improved Person Re-identification

Prathistith Raj Medi, Ghanta Sai Krishna, Praneeth Nemani, Satyanarayana Vollala, Santosh Kumar

arXiv:2304.07721v1 · 3.94 citations · h-index: 6

Originality: Highly original

AI Analysis

This work addresses a critical bottleneck for real-life surveillance systems, where occlusion reduces re-identification performance.

The paper tackles the problem of person re-identification under occlusion by developing an end-to-end framework that detects and reconstructs occluded pixels using deep neural networks, achieving exceptional Rank-1 accuracy on multiple datasets.

Person re-identification is vital for monitoring and tracking crowd movement to enhance public security. However, occlusion substantially reduces the performance of existing re-identification systems and remains a challenging problem. In this work, we propose a plausible solution by developing an effective occlusion detection and reconstruction framework for RGB images and videos, built from Deep Neural Networks. Specifically, a CNN-based occlusion detection model classifies individual input frames. The occluded pixels of frames flagged as occluded are then reconstructed by a Conv-LSTM for sequential (video) data and by an Autoencoder for non-sequential (image) data. The quality of the reconstructed RGB frames is further refined using a Conditional Generative Adversarial Network (cGAN). Our method is evaluated on four well-known public datasets of the domain, and the qualitative reconstruction results are visually appealing. Quantitative evaluation of the re-identification accuracy of a Siamese network shows exceptional Rank-1 accuracy after occluded pixel reconstruction on various datasets. A comparative analysis with state-of-the-art approaches also demonstrates the robustness of our work for use in real-life surveillance systems.
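For readers unfamiliar with the metric, Rank-1 accuracy is the fraction of query samples whose nearest gallery sample carries the same identity. The following is a minimal sketch, not the authors' evaluation code. It assumes embeddings have already been produced by some re-identification model and that Euclidean distance is the matching score; the abstract does not state which distance the Siamese network uses. The function name `rank1_accuracy` and the toy data are hypothetical. Full re-identification protocols usually also exclude same-camera gallery samples, which this sketch omits.

```python
import math


def rank1_accuracy(query_emb, query_ids, gallery_emb, gallery_ids):
    """Fraction of queries whose closest gallery embedding shares their identity.

    Assumption: Euclidean distance is the matching score (illustrative choice).
    """
    hits = 0
    for q_vec, q_id in zip(query_emb, query_ids):
        # Index of the gallery embedding closest to this query.
        nearest = min(
            range(len(gallery_emb)),
            key=lambda j: math.dist(q_vec, gallery_emb[j]),
        )
        if gallery_ids[nearest] == q_id:
            hits += 1
    return hits / len(query_ids)


# Toy 2-D embeddings (hypothetical data, for illustration only).
query_emb = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
query_ids = [0, 1, 2]
gallery_emb = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]]
gallery_ids = [0, 1, 2]

# The first two queries retrieve the correct identity; the third does not.
print(rank1_accuracy(query_emb, query_ids, gallery_emb, gallery_ids))
```

In an evaluation like the one described above, the same function would be applied to embeddings of the occluded frames and of the reconstructed frames to compare the two scores.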
