Camera-Conditioned Stable Feature Generation for Isolated Camera Supervised Person Re-IDentification
This paper addresses the challenge of person re-identification in surveillance systems with isolated cameras, where cross-view data is scarce, by proposing a novel feature generation method. The contribution is incremental, however, as it builds on existing generative models.
The paper tackles the problem of learning camera-view invariant features for person re-identification under isolated camera supervision, where cross-camera training samples are unavailable. It introduces a pipeline that synthesizes cross-camera samples in feature space using a novel method called Camera-Conditioned Stable Feature Generation (CCSFG). Extensive experiments on two ISCS datasets show that the approach outperforms competing methods.
To learn camera-view invariant features for person Re-IDentification (Re-ID), the cross-camera image pairs of each person play an important role. However, such cross-view training samples may be unavailable under the ISolated Camera Supervised (ISCS) setting, e.g., in a surveillance system deployed across distant scenes. To handle this challenging problem, a new pipeline is introduced that synthesizes the cross-camera samples in the feature space for model training. Specifically, the feature encoder and generator are optimized end-to-end under a novel method, Camera-Conditioned Stable Feature Generation (CCSFG). Its joint learning procedure raises concerns about the stability of generative model training. Therefore, a new feature generator, $σ$-Regularized Conditional Variational Autoencoder ($σ$-Reg.~CVAE), is proposed with theoretical and experimental analysis of its robustness. Extensive experiments on two ISCS person Re-ID datasets demonstrate the superiority of our CCSFG over its competitors.
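To make the idea of synthesizing cross-camera samples in feature space concrete, the following is a minimal sketch of a generic camera-conditioned variational autoencoder step, written in plain Python. It is an illustration under stated assumptions, not the CCSFG implementation: the dimensions, the random linear maps standing in for trained networks, and all function names (`encode`, `decode`, `synthesize_cross_camera`) are hypothetical, and the $σ$-regularization itself is not shown because the abstract does not specify its form.

```python
import math
import random

random.seed(0)

# Hypothetical sizes chosen only for illustration.
FEAT_DIM, LATENT_DIM, NUM_CAMERAS = 8, 4, 3


def one_hot(camera_id, num_cameras=NUM_CAMERAS):
    """Encode a camera index as a one-hot condition vector."""
    return [1.0 if i == camera_id else 0.0 for i in range(num_cameras)]


def random_matrix(rows, cols, scale=0.1):
    """Random weights standing in for a trained network (assumption)."""
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(x, W):
    """Multiply row vector x (length rows) by matrix W (rows x cols)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


W_enc = random_matrix(FEAT_DIM + NUM_CAMERAS, 2 * LATENT_DIM)
W_dec = random_matrix(LATENT_DIM + NUM_CAMERAS, FEAT_DIM)


def encode(feature, camera_id):
    """Return mean and log-variance of q(z | feature, camera)."""
    h = matvec(feature + one_hot(camera_id), W_enc)
    return h[:LATENT_DIM], h[LATENT_DIM:]


def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def decode(z, camera_id):
    """Generate a feature vector conditioned on the given camera."""
    return matvec(z + one_hot(camera_id), W_dec)


def synthesize_cross_camera(feature, source_camera, target_camera):
    """Encode under the source camera, then decode under the target camera."""
    mu, log_var = encode(feature, source_camera)
    z = reparameterize(mu, log_var)
    return decode(z, target_camera)


# A feature observed only under camera 0, re-generated as if seen by camera 2.
feature_cam0 = [random.gauss(0.0, 1.0) for _ in range(FEAT_DIM)]
synthetic_cam2 = synthesize_cross_camera(feature_cam0, 0, 2)
print(len(synthetic_cam2))  # 8
```

In a setting like the one described above, such synthetic features could serve as the missing cross-camera counterparts when training the feature encoder. How the encoder and generator are actually coupled, and how training is stabilized, is left to the method itself.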