CVApr 30, 2018

Deep Co-attention based Comparators For Relative Representation Learning in Person Re-identification

Lin Wu, Yang Wang, Junbin Gao, Dacheng Tao

arXiv:1804.11027v16.821 citations

Originality Incremental advance

AI Analysis

This work improves person re-identification for surveillance and security applications by enabling more flexible and discriminant representations, though it is incremental as it builds on existing attention and comparator frameworks.

The paper tackles the problem of person re-identification by addressing the independent detection of relevant image parts and reliance on spatial manipulation in existing methods, introducing Deep Co-attention based Comparators that fuse co-dependent representations to focus on relevant regions and achieve state-of-the-art performance on three benchmark datasets.

Person re-identification (re-ID) requires rapid, flexible yet discriminant representations to quickly generalize to unseen observations on-the-fly and recognize the same identity across disjoint camera views. Recent effective methods are developed in a pair-wise similarity learning system to detect a fixed set of features from distinct regions which are mapped to their vector embeddings for the distance measuring. However, the most relevant and crucial parts of each image are detected independently without referring to the dependency conditioned on one and another. Also, these region based methods rely on spatial manipulation to position the local features in comparable similarity measuring. To combat these limitations, in this paper we introduce the Deep Co-attention based Comparators (DCCs) that fuse the co-dependent representations of the paired images so as to focus on the relevant parts of both images and produce their \textit{relative representations}. Given a pair of pedestrian images to be compared, the proposed model mimics the foveation of human eyes to detect distinct regions concurrent on both images, namely co-dependent features, and alternatively attend to relevant regions to fuse them into the similarity learning. Our comparator is capable of producing dynamic representations relative to a particular sample every time, and thus well-suited to the case of re-identifying pedestrians on-the-fly. We perform extensive experiments to provide the insights and demonstrate the effectiveness of the proposed DCCs in person re-ID. Moreover, our approach has achieved the state-of-the-art performance on three benchmark data sets: DukeMTMC-reID \cite{DukeMTMC}, CUHK03 \cite{FPNN}, and Market-1501 \cite{Market1501}.

View on arXiv PDF

Similar