CV · May 30, 2017

Robust Tracking Using Region Proposal Networks

arXiv:1705.10447v1
Originality: Highly original
AI Analysis

This work addresses the difficulty of transferring deep features trained for image classification to visual tracking, and proposes a tracker that avoids the model ensembles and feature engineering such a transfer usually requires.

The paper finds that the top-layer features of a Region Proposal Network (RPN) support robust visual tracking once they are trained with a novel loss function. It reports state-of-the-art results on OTB50, OTB100, and VOT2016 without model ensembles or feature engineering.

Recent advances in visual tracking have shown that deep Convolutional Neural Networks (CNNs) trained for image classification can be strong feature extractors for discriminative trackers. However, because image classification and tracking differ drastically, extra treatments such as model ensembles and feature engineering must be applied to bridge the two domains. Such procedures are either time-consuming or hard to generalize across datasets. In this paper we discover that the internal structure of the top-layer features of a Region Proposal Network (RPN) can be used for robust visual tracking. We show that this property must be unlocked by a novel loss function that simultaneously considers classification accuracy and bounding-box quality. Without ensembles or any extra treatment of the feature maps, our proposed method achieves state-of-the-art results on several large-scale benchmarks, including OTB50, OTB100, and VOT2016. We will make our code publicly available.
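The abstract does not specify the form of the novel loss. As a point of reference only, the sketch below shows the general idea of a single objective that scores both classification and box quality, using the familiar RPN-style combination of a log loss and a smooth L1 box-regression term. This is not the paper's loss: the function names, the weight `lam`, and the normalization are all assumptions made for illustration.

```python
import math


def smooth_l1(x):
    """Smooth L1 penalty on one residual: quadratic near zero, linear beyond 1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def joint_loss(probs, labels, pred_offsets, target_offsets, lam=1.0):
    """Illustrative joint objective (assumed form, not the paper's loss).

    probs          -- predicted object probability per anchor, in [0, 1]
    labels         -- 1 for a positive (object) anchor, 0 for background
    pred_offsets   -- predicted box offsets per anchor, e.g. (tx, ty, tw, th)
    target_offsets -- ground-truth box offsets per anchor
    lam            -- hypothetical weight balancing the two terms
    """
    eps = 1e-12  # guards against log(0)

    # Classification term: binary log loss averaged over all anchors.
    cls = -sum(
        y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        for p, y in zip(probs, labels)
    ) / len(probs)

    # Box term: smooth L1 over offsets, counted for positive anchors only
    # and averaged over the number of positives.
    num_pos = max(1, sum(labels))
    reg = sum(
        y * sum(smooth_l1(a - b) for a, b in zip(t, t_star))
        for y, t, t_star in zip(labels, pred_offsets, target_offsets)
    ) / num_pos

    return cls + lam * reg
```

In this sketch a confident, correct classification paired with a poorly localized box is still penalized through the second term, which is the kind of coupling between classification accuracy and bounding-box quality the abstract describes.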
