CV
Oct 15, 2023

Staged Depthwise Correlation and Feature Fusion for Siamese Object Tracking

arXiv:2310.09747v1
1 citation
h-index: 20
Originality: Incremental advance
AI Analysis

This work addresses the challenge of optimizing feature extraction for visual tracking, which matters for applications such as surveillance and robotics. It appears incremental, however, because it builds on existing Siamese network architectures.

The paper tackles feature extraction in visual object tracking by proposing DCFFNet, a Siamese network with a depthwise correlation and feature fusion module. It reports competitive performance on the OTB100, VOT2018, and LaSOT benchmarks while meeting real-time requirements.

In this work, we propose a novel staged depthwise correlation and feature fusion network, named DCFFNet, to further optimize feature extraction for visual tracking. We build our deep tracker upon a Siamese network architecture, which is trained offline from scratch on multiple large-scale datasets in an end-to-end manner. The model contains a core component, the depthwise correlation and feature fusion module (correlation-fusion module), which helps the model learn a set of optimal weights for a specific object by using ensembles of multi-level features from lower and higher layers and multi-channel semantics within the same layer. We combine a modified ResNet-50 with the proposed correlation-fusion layer to form the feature extractor of our model. During training, we find that the correlation-fusion module makes training more stable. For a comprehensive evaluation, we test our tracker on popular benchmarks, including OTB100, VOT2018 and LaSOT. Extensive experimental results demonstrate that the proposed method achieves competitive performance against many leading trackers in terms of accuracy and precision, while satisfying the real-time requirements of applications.
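
The abstract does not give the equations of the correlation-fusion module, so the following is only a rough illustration of the two generic operations it names, not DCFFNet itself. It sketches depthwise cross-correlation (each template channel slides over the matching search channel) followed by a weighted sum across feature levels. The function names `depthwise_xcorr` and `fuse_levels`, the softmax normalization of the fusion weights, and the toy inputs are all assumptions made for this sketch.

```python
import math


def depthwise_xcorr(search, template):
    """Correlate each template channel with the matching search channel.

    search:   list of C channels, each an H x W list of lists
    template: list of C channels, each an h x w list of lists (h <= H, w <= W)
    returns:  list of C response maps, each (H-h+1) x (W-w+1)
    """
    if len(search) != len(template):
        raise ValueError("search and template need the same channel count")
    responses = []
    for x, z in zip(search, template):
        H, W, h, w = len(x), len(x[0]), len(z), len(z[0])
        # Slide the template over the search map (no padding, stride 1).
        resp = [
            [
                sum(x[i + u][j + v] * z[u][v] for u in range(h) for v in range(w))
                for j in range(W - w + 1)
            ]
            for i in range(H - h + 1)
        ]
        responses.append(resp)
    return responses


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(level_responses, logits):
    """Weighted sum of same-shaped response stacks from several levels.

    Assumption: one scalar weight per level, normalized with softmax.
    In a trained tracker these logits would be learned parameters.
    """
    weights = softmax(logits)
    channels = len(level_responses[0])
    rows = len(level_responses[0][0])
    cols = len(level_responses[0][0][0])
    return [
        [
            [
                sum(wt * lvl[c][i][j] for wt, lvl in zip(weights, level_responses))
                for j in range(cols)
            ]
            for i in range(rows)
        ]
        for c in range(channels)
    ]


# Toy inputs: 2 channels, 3x3 search features, 2x2 template features.
search = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
template = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
low = depthwise_xcorr(search, template)
# Stand-in for a response from a second (higher) backbone level.
high = [[[2 * v for v in row] for row in ch] for ch in low]
fused = fuse_levels([low, high], logits=[0.0, 0.0])
```

A real tracker would run these operations on tensors in a deep learning framework, typically as a grouped convolution; the nested-list version is meant only to make the per-channel sliding window and the level weighting explicit.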
