Re-Identification with Consistent Attentive Siamese Networks
This work addresses key open problems in person re-identification for surveillance and security applications, but it appears incremental because it builds on existing Siamese and attention methods.
The paper tackles spatial localization and view-invariant representation learning for person re-identification by proposing a Consistent Attentive Siamese Network, which achieves competitive performance on the CUHK03-NP, DukeMTMC-ReID, and Market-1501 datasets.
We propose a new deep architecture for person re-identification (re-id). While re-id has seen much recent progress, spatial localization and view-invariant representation learning for robust cross-view matching remain key, unsolved problems. We address these problems by means of a new attention-driven Siamese learning architecture, called the Consistent Attentive Siamese Network. Our key innovations compared to competing methods include (a) a flexible framework design that produces attention with only identity labels as supervision, (b) explicit mechanisms to enforce attention consistency among images of the same person, and (c) a new Siamese framework that integrates attention and attention consistency, producing principled supervisory signals as well as the first mechanism that can explain the reasoning behind the Siamese framework's predictions. We conduct extensive evaluations on the CUHK03-NP, DukeMTMC-ReID, and Market-1501 datasets and report competitive performance.
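To make points (a) and (b) concrete, the following is a minimal illustrative sketch, not the architecture described above. It assumes a class-activation-style attention map (a weighted sum of feature channels using the weights of the identity class) and a simple mean-squared-difference penalty between the attention maps of two images of the same person. The function names `class_attention` and `consistency_penalty`, the normalization, and the choice of penalty are all assumptions made for illustration.

```python
from typing import List

Map2D = List[List[float]]


def class_attention(features: List[Map2D], class_weights: List[float]) -> Map2D:
    """Hypothetical CAM-style attention driven only by an identity class.

    Each spatial location gets the class-weighted sum of its channel
    activations, clipped at zero and scaled so the peak equals 1.
    """
    height, width = len(features[0]), len(features[0][0])
    raw = [
        [
            max(0.0, sum(w * ch[i][j] for w, ch in zip(class_weights, features)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    peak = max(max(row) for row in raw)
    if peak == 0.0:
        return raw  # nothing activated; leave the all-zero map as is
    return [[v / peak for v in row] for row in raw]


def consistency_penalty(att_a: Map2D, att_b: Map2D) -> float:
    """Assumed consistency term: mean squared difference of two attention maps."""
    count = len(att_a) * len(att_a[0])
    total = sum(
        (x - y) ** 2 for row_a, row_b in zip(att_a, att_b) for x, y in zip(row_a, row_b)
    )
    return total / count


# Toy 2-channel, 2x2 feature maps for two images of the same identity.
weights = [1.0, 0.5]
feats_a = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
feats_b = [[[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]

att_a = class_attention(feats_a, weights)
att_b = class_attention(feats_b, weights)
penalty = consistency_penalty(att_a, att_b)
```

In a trained model, a term of this kind would be added to the identity classification loss so that both branches of the Siamese pair are encouraged to attend to the same regions; how the actual method weights and computes these terms is not specified in this abstract.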