CV · Oct 26, 2018

Video-based Person Re-identification Using Spatial-Temporal Attention Networks

arXiv:1810.11261v1 · 10 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of identifying individuals across videos captured by different cameras, with surveillance as the main application. The advance is incremental because it builds on existing attention-based methods.

The paper tackles video-based person re-identification by proposing a spatial-temporal attention network that uses attention scores to weight frame features, achieving state-of-the-art performance on two benchmark datasets.

We consider the problem of video-based person re-identification. The goal is to identify a person from videos captured under different cameras. In this paper, we propose an efficient spatial-temporal attention-based model for person re-identification from videos. Our method generates an attention score for each frame based on frame-level features. The attention scores of all frames in a video are used to produce a weighted feature vector for the input video. Unlike most existing deep learning methods that use a global representation, our approach focuses on attention scores. Extensive experiments on two benchmark datasets demonstrate that our method achieves state-of-the-art performance. This is a technical report.
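The frame-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the attention scores are computed or normalized, so the linear scorer (`w`, `b`), the softmax normalization, and the function name `attention_pool` are assumptions made for illustration, not the paper's architecture.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frame_features, w, b=0.0):
    """Combine per-frame feature vectors into one video-level vector.

    frame_features: list of T feature vectors, each of length D.
    w, b: parameters of a hypothetical linear scorer (an assumption;
          the abstract only says scores come from frame-level features).
    Returns (video_feature, attention_weights).
    """
    # One raw attention score per frame.
    scores = [sum(wi * xi for wi, xi in zip(w, f)) + b for f in frame_features]
    # Assumed normalization: softmax over the frames of the video.
    weights = softmax(scores)
    dim = len(frame_features[0])
    # Weighted sum of frame features, one dimension at a time.
    video_feature = [
        sum(a * f[d] for a, f in zip(weights, frame_features))
        for d in range(dim)
    ]
    return video_feature, weights


# Toy example: three frames with 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_feature, weights = attention_pool(frames, w=[0.5, -0.5])
```

With all-zero scorer weights every frame receives the same attention, and the result reduces to plain average pooling over frames, which is one way to see what the attention scores add over a uniform global representation.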

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
