Learning Compact Target-Oriented Feature Representations for Visual Tracking
This work addresses the need for more compact and discriminative feature representations in visual tracking, offering an incremental advance for computer vision applications.
The paper tackles the problem of redundant and noisy deep features in visual tracking. It proposes learning compact, target-oriented feature representations with Laplacian coding inside a discriminative correlation filter framework, which yields clear improvements over baseline trackers with only a modest impact on frame rate.
Many state-of-the-art trackers rely on a pretrained convolutional neural network (CNN) model for correlation filtering. The resulting deep features can be redundant, noisy, and insufficiently discriminative for particular target instances, which can degrade tracking performance. To handle this problem, we propose a novel approach to visual tracking that combines the good generalization of generative models with the strong discrimination of discriminative models. In particular, we learn compact, discriminative, and target-oriented feature representations using the Laplacian coding algorithm, which exploits the dependence among the input local features, within a discriminative correlation filter framework. The feature representations and the correlation filter are learnt jointly so that they enhance each other, using a fast solver that adds only a slight computational burden to tracking. Extensive experiments on three benchmark datasets demonstrate that the proposed framework clearly outperforms baseline trackers with a modest impact on the frame rate, and performs comparably to state-of-the-art methods.
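For readers unfamiliar with the discriminative correlation filter framework mentioned above, the following is a minimal sketch of a generic single-channel correlation filter trained by ridge regression in the Fourier domain. It is not the method proposed in this paper: it omits the Laplacian coding step and the joint solver entirely, and the function names (`gaussian_label`, `train_filter`, `detect`), the regularization weight `lam`, and the label width `sigma` are all illustrative assumptions.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian whose peak sits at index (0, 0)."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    g = np.exp(-(ys[:, None] ** 2 + xs[None, :] ** 2) / (2 * sigma ** 2))
    # Circularly shift the peak from the patch centre to the origin.
    return np.roll(g, (-(h // 2), -(w // 2)), axis=(0, 1))

def train_filter(x, y, lam=1e-2):
    """Closed-form ridge regression solution, element-wise per frequency."""
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)

def detect(H, z):
    """Apply the filter to a new patch and return the estimated (dy, dx) shift."""
    Z = np.fft.fft2(z)
    response = np.real(np.fft.ifft2(H * Z))
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    h, w = response.shape
    # Map circular indices to signed displacements.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

# Hypothetical usage: a random patch stands in for a feature channel.
rng = np.random.default_rng(0)
patch = rng.standard_normal((64, 64))
H = train_filter(patch, gaussian_label(patch.shape))
shifted = np.roll(patch, (3, -5), axis=(0, 1))  # simulate target motion
estimated_shift = detect(H, shifted)
```

In a tracker built on deep features, a filter of this kind would typically be learnt per feature channel. The abstract's contribution lies in how those features are made compact and target-oriented before and during filter learning, which this sketch does not attempt to reproduce.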