CV, Nov 27, 2017

Hierarchical Spatial-aware Siamese Network for Thermal Infrared Object Tracking

arXiv:1711.09539v2, 131 citations
Originality: Incremental advance
AI Analysis

This paper addresses accurate object tracking in thermal infrared imagery for applications such as surveillance. The advance is incremental: it adapts existing Siamese network concepts to a specific domain.

The paper tackles the mismatch between a classifier's objective (label prediction) and a tracker's objective (location estimation) in thermal infrared (TIR) object tracking by framing tracking as a similarity verification task. It proposes HSSNet, a hierarchical spatial-aware Siamese CNN, and reports favorable performance on the VOT-TIR 2015 and VOT-TIR 2016 benchmarks compared to state-of-the-art methods.

Most thermal infrared (TIR) tracking methods are discriminative, treating the tracking problem as a classification task. However, the objective of the classifier (label prediction) is not coupled to the objective of the tracker (location estimation). The classification task focuses on the between-class difference of the arbitrary objects, while the tracking task mainly deals with the within-class difference of the same objects. In this paper, we cast the TIR tracking problem as a similarity verification task, which is coupled well to the objective of the tracking task. We propose a TIR tracker via a Hierarchical Spatial-aware Siamese Convolutional Neural Network (CNN), named HSSNet. To obtain both spatial and semantic features of the TIR object, we design a Siamese CNN that coalesces the multiple hierarchical convolutional layers. Then, we propose a spatial-aware network to enhance the discriminative ability of the coalesced hierarchical feature. Subsequently, we train this network end to end on a large visible video detection dataset to learn the similarity between paired objects before we transfer the network into the TIR domain. Next, this pre-trained Siamese network is used to evaluate the similarity between the target template and target candidates. Finally, we locate the candidate that is most similar to the tracked target. Extensive experimental results on the benchmarks VOT-TIR 2015 and VOT-TIR 2016 show that our proposed method achieves favourable performance compared to the state-of-the-art methods.
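
The last two steps of the abstract (score each candidate against the target template, then pick the most similar one) can be illustrated with a small sketch. This is not the authors' implementation: HSSNet learns its features and similarity with a Siamese CNN and a spatial-aware network, whereas the sketch below assumes hand-written feature vectors, plain concatenation as a stand-in for coalescing hierarchical layers, and cosine similarity as a stand-in for the learned similarity. The names `coalesce`, `cosine_similarity`, and `locate_target` are hypothetical.

```python
import math


def coalesce(layer_features):
    """Concatenate per-layer feature vectors into one vector.

    Assumption: plain concatenation stands in for the paper's
    coalescing of multiple hierarchical convolutional layers.
    """
    merged = []
    for features in layer_features:
        merged.extend(features)
    return merged


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros).

    Assumption: this replaces the learned similarity of the Siamese network.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def locate_target(template_layers, candidates_layers):
    """Return (index, score) of the candidate most similar to the template."""
    template = coalesce(template_layers)
    best_index, best_score = -1, float("-inf")
    for index, candidate_layers in enumerate(candidates_layers):
        score = cosine_similarity(template, coalesce(candidate_layers))
        if score > best_score:
            best_index, best_score = index, score
    return best_index, best_score


# Toy usage: two "layers" per patch, three candidate patches.
template = [[1.0, 0.0], [0.5, 0.5]]
candidates = [
    [[0.0, 1.0], [0.5, 0.5]],
    [[1.0, 0.0], [0.5, 0.5]],
    [[-1.0, 0.0], [0.0, 0.0]],
]
best_index, best_score = locate_target(template, candidates)
```

In this toy setup the second candidate has the same features as the template, so it is the one selected. A real tracker would draw candidates from a search region around the previous target location and compute features with the trained network.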

Code Implementations: 1 repo
