CV | Mar 15, 2020

Siamese Box Adaptive Network for Visual Tracking

arXiv:2003.06761v2 | 829 citations | Has Code
AI Analysis

This paper addresses the need for more flexible and efficient visual tracking methods for applications such as surveillance and robotics. It is incremental, however, because it builds on existing Siamese network frameworks.

The paper proposes the Siamese Box Adaptive Network (SiamBAN), a visual tracker that avoids tedious configuration such as multi-scale searching or anchor boxes. It reports state-of-the-art performance on benchmarks such as VOT2018 and VOT2019 while running at 40 FPS.

Most of the existing trackers usually rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target. Unfortunately, they typically call for tedious and heuristic configurations. To address this issue, we propose a simple yet effective visual tracking framework (named Siamese Box Adaptive Network, SiamBAN) by exploiting the expressive power of the fully convolutional network (FCN). SiamBAN views the visual tracking problem as a parallel classification and regression problem, and thus directly classifies objects and regresses their bounding boxes in a unified FCN. The no-prior box design avoids hyper-parameters associated with the candidate boxes, making SiamBAN more flexible and general. Extensive experiments on visual tracking benchmarks including VOT2018, VOT2019, OTB100, NFS, UAV123, and LaSOT demonstrate that SiamBAN achieves state-of-the-art performance and runs at 40 FPS, confirming its effectiveness and efficiency. The code will be available at https://github.com/hqucv/siamban.
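To make the "parallel classification and regression" idea concrete, here is a minimal sketch of how an anchor-free head's output could be decoded into one box. It is an illustration under stated assumptions, not the authors' implementation: the function name `decode_best_box`, the `(left, top, right, bottom)` offset layout, and the `stride` and `origin` parameters are all hypothetical and do not come from the abstract.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Offsets = Tuple[float, float, float, float]  # (left, top, right, bottom)


def decode_best_box(
    scores: List[List[float]],
    offsets: List[List[Offsets]],
    stride: int = 8,
    origin: float = 0.0,
) -> Tuple[Box, float]:
    """Pick the highest-scoring grid cell and turn its offsets into a box.

    scores[i][j]  : foreground score of grid cell (row i, column j).
    offsets[i][j] : ASSUMED layout, distances from the cell's image-space
                    point to the left, top, right and bottom box edges.
    """
    best_score = float("-inf")
    best_cell = (0, 0)
    for i, row in enumerate(scores):
        for j, score in enumerate(row):
            if score > best_score:
                best_score, best_cell = score, (i, j)

    i, j = best_cell
    # Map the grid cell back to a point in the search image.
    px = origin + j * stride
    py = origin + i * stride

    left, top, right, bottom = offsets[i][j]
    # No anchor box is involved: the box comes straight from the offsets.
    box = (px - left, py - top, px + right, py + bottom)
    return box, best_score


# Toy 2x2 score map and offset map (made-up numbers).
scores = [[0.1, 0.2], [0.9, 0.3]]
offsets = [
    [(1.0, 1.0, 1.0, 1.0), (2.0, 2.0, 2.0, 2.0)],
    [(3.0, 4.0, 5.0, 6.0), (1.0, 1.0, 1.0, 1.0)],
]
box, score = decode_best_box(scores, offsets, stride=8)
print(box, score)
```

Because every location predicts its own box edges, the sketch has no anchor scales or aspect ratios to tune, which is the kind of hyper-parameter the abstract says the no-prior box design avoids.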

Code Implementations | 2 repos
