CV | AI | NE | Jun 28, 2017

Hierarchical Attentive Recurrent Tracking

arXiv:1706.09262v2 | 63 citations
AI Analysis

This paper addresses single object tracking in videos, with applications such as surveillance and autonomous driving, but it appears incremental because it builds on existing attention mechanisms.

The paper tackles class-agnostic object tracking in cluttered environments with a hierarchical attentive recurrent model that uses spatial attention to suppress irrelevant features. The model is evaluated on pedestrian tracking in the KTH activity recognition dataset and on the more difficult KITTI object tracking dataset.

Class-agnostic object tracking is particularly difficult in cluttered environments as target specific discriminative models cannot be learned a priori. Inspired by how the human visual cortex employs spatial attention and separate "where" and "what" processing pathways to actively suppress irrelevant visual features, this work develops a hierarchical attentive recurrent model for single object tracking in videos. The first layer of attention discards the majority of background by selecting a region containing the object of interest, while the subsequent layers tune in on visual features particular to the tracked object. This framework is fully differentiable and can be trained in a purely data driven fashion by gradient methods. To improve training convergence, we augment the loss function with terms for a number of auxiliary tasks relevant for tracking. Evaluation of the proposed model is performed on two datasets: pedestrian tracking on the KTH activity recognition dataset and the more difficult KITTI object tracking dataset.
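The two attention stages and the augmented loss described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`crop_glimpse`, `gate_features`, `total_loss`), the box layout, and the weights are all hypothetical, and the hard crop used here is not differentiable, whereas the abstract states that the actual framework is fully differentiable. It only shows the order of operations: select a region first, then suppress features inside that region, and add weighted auxiliary terms to the main loss.

```python
def crop_glimpse(frame, box):
    """First attention stage (simplified): keep only the region of the frame
    inside `box`, discarding most of the background.
    `box` is (top, left, height, width); this layout is an assumption."""
    top, left, height, width = box
    return [row[left:left + width] for row in frame[top:top + height]]


def gate_features(glimpse, mask):
    """Later attention stage (simplified): element-wise gating that
    suppresses values inside the glimpse that do not belong to the target."""
    return [[v * m for v, m in zip(v_row, m_row)]
            for v_row, m_row in zip(glimpse, mask)]


def total_loss(main_loss, aux_losses, aux_weights):
    """Main tracking loss plus weighted auxiliary-task terms.
    The weights are hypothetical hyperparameters."""
    return main_loss + sum(w * l for w, l in zip(aux_weights, aux_losses))


# A tiny 4x5 "frame": the nonzero values stand in for object and clutter.
frame = [
    [0, 0, 0, 0, 0],
    [0, 5, 7, 0, 0],
    [0, 6, 9, 3, 0],
    [0, 0, 0, 0, 0],
]

glimpse = crop_glimpse(frame, box=(1, 1, 2, 3))
mask = [[1, 1, 0],
        [1, 1, 0]]  # hypothetical mask: last column treated as clutter
features = gate_features(glimpse, mask)
loss = total_loss(1.0, aux_losses=[0.5, 0.25], aux_weights=[2.0, 4.0])
```

In a trainable version, the crop would be replaced by a soft, differentiable attention operation and the mask would be predicted by the network rather than given by hand.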

Code Implementations: 1 repo
