Permutation-invariant Feature Restructuring for Correlation-aware Image Set-based Recognition
This paper addresses the challenge of image set-based recognition for applications such as security and surveillance, though it appears incremental because it builds on existing attention and dictionary learning methods.
The paper tackles the problem of comparing image sets whose images are unordered and vary in quantity and quality. It uses feature restructuring to exploit correlations within and between sets, achieving top performance on competitive benchmarks for face recognition and person re-identification.
We consider the problem of measuring the similarity of image sets whose heterogeneous images are unordered and vary in quantity and quality. We use feature restructuring to exploit the correlations of both inner- and inter-set images. Specifically, residual self-attention restructures each feature using the other features within its set, emphasizing the discriminative images and eliminating redundancy. Then, a dependency-guided representation scheme based on sparse/collaborative learning reconstructs the probe features conditioned on the gallery features in order to adaptively align the two sets. This makes our framework compatible with both verification and open-set identification. We show that the parametric self-attention network and the non-parametric dictionary learning can be trained end-to-end by a unified alternating optimization scheme, and that the full framework is permutation-invariant. In our numerical experiments, our method achieves top performance on competitive image set/video-based face recognition and person re-identification benchmarks.
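The two stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single-head attention with projection matrices `Wq`, `Wk`, `Wv`, the mean pooling in `set_descriptor`, and the ridge-regularized (collaborative) reconstruction with weight `lam` are all assumptions chosen for illustration, and the sparse variant and the end-to-end training scheme are not shown. The sketch only demonstrates why such a pipeline can be insensitive to the order of images in either set.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def residual_self_attention(X, Wq, Wk, Wv):
    """Restructure each feature using the other features in the same set.

    X is an (n, d) matrix with one row per image. The output is
    permutation-equivariant: reordering rows of X reorders the output rows.
    (Hypothetical single-head form; the paper's exact layer is not given here.)
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]), axis=-1)  # (n, n) attention weights
    return X + A @ V  # residual connection keeps the original features


def set_descriptor(X, Wq, Wk, Wv):
    # Assumption: mean pooling turns the equivariant output into an invariant one.
    return residual_self_attention(X, Wq, Wk, Wv).mean(axis=0)


def collaborative_residual(probe, gallery, lam=0.1):
    """Reconstruct probe features from gallery features with a ridge penalty.

    For each probe row p, solves min_c ||p - gallery.T @ c||^2 + lam * ||c||^2
    in closed form and returns the mean squared reconstruction error.
    A small value suggests the probe set is well explained by the gallery set.
    """
    n = gallery.shape[0]
    gram = gallery @ gallery.T + lam * np.eye(n)       # (n, n)
    coef = np.linalg.solve(gram, gallery @ probe.T)    # (n, m) coefficients
    recon = coef.T @ gallery                           # (m, d) reconstructions
    return float(np.mean(np.sum((probe - recon) ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    Wq, Wk, Wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
    probe = rng.normal(size=(3, d))    # 3 probe images
    gallery = rng.normal(size=(5, d))  # 5 gallery images

    shuffled = gallery[rng.permutation(5)]
    same_descriptor = np.allclose(
        set_descriptor(gallery, Wq, Wk, Wv), set_descriptor(shuffled, Wq, Wk, Wv)
    )
    same_residual = np.isclose(
        collaborative_residual(probe, gallery),
        collaborative_residual(probe, shuffled),
    )
    print(same_descriptor, same_residual)
```

In this sketch, shuffling the gallery leaves both the pooled descriptor and the reconstruction residual unchanged up to floating-point tolerance. Thresholding the residual is one plausible way to support verification and open-set rejection, though the paper's actual decision rule is not stated in this abstract.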