Learning Compact Appearance Representation for Video-based Person Re-Identification
This work addresses the problem of efficiently identifying individuals across video frames for surveillance and security applications; its contribution is an incremental improvement in method design.
The paper tackles video-based person re-identification by selecting representative frames based on walking profiles and using a multiple CNN architecture with feature pooling to learn a compact appearance representation, achieving superior performance over existing methods on benchmark datasets.
This paper presents a novel approach to video-based person re-identification using multiple Convolutional Neural Networks (CNNs). Unlike previous work, we aim to extract a compact yet discriminative appearance representation from several frames rather than from the whole sequence. Specifically, given a video, representative frames are selected based on the walking profile of consecutive frames. A multiple-CNN architecture with feature pooling is proposed to learn the features of the selected representative frames and compile them into a compact description of the pedestrian for identification. Experiments on benchmark datasets demonstrate the superiority of the proposed method over existing person re-identification approaches.
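The two-stage idea described above (pick a few representative frames, then pool their per-frame features into one descriptor) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes that the walking profile is available as a one-dimensional signal with one value per frame, that representative frames lie at its local extrema, and that per-frame CNN features have already been computed. The function names `select_representative_frames` and `pool_features`, the `num_frames` parameter, and the choice of max or average pooling are all hypothetical.

```python
import numpy as np


def select_representative_frames(profile, num_frames=4):
    """Return indices of up to `num_frames` frames at local extrema of the profile.

    Assumption: `profile` is a 1-D walking-profile signal, one value per frame.
    """
    profile = np.asarray(profile, dtype=float)
    interior = np.arange(1, len(profile) - 1)
    left, mid, right = profile[:-2], profile[1:-1], profile[2:]
    # A frame is a local extremum if it is strictly above or below both neighbours.
    is_max = (mid > left) & (mid > right)
    is_min = (mid < left) & (mid < right)
    extrema = interior[is_max | is_min]
    if len(extrema) == 0:
        # Fallback (assumption): evenly spaced frames when the profile has no extrema.
        count = min(num_frames, len(profile))
        extrema = np.linspace(0, len(profile) - 1, num=count).astype(int)
    return extrema[:num_frames]


def pool_features(frame_features, mode="max"):
    """Pool per-frame feature vectors (rows) into a single compact descriptor."""
    frame_features = np.asarray(frame_features, dtype=float)
    if mode == "max":
        return frame_features.max(axis=0)   # element-wise maximum over frames
    if mode == "avg":
        return frame_features.mean(axis=0)  # element-wise mean over frames
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy usage: a synthetic periodic profile and random stand-ins for CNN features.
profile = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
selected = select_representative_frames(profile)
features = np.random.default_rng(0).normal(size=(len(profile), 8))
descriptor = pool_features(features[selected], mode="max")
```

In this sketch the descriptor has the same dimensionality as a single frame's feature vector regardless of sequence length, which is the sense in which the representation is compact.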