Online Visual Tracking with One-Shot Context-Aware Domain Adaptation
This work aims to improve the robustness of online visual tracking for applications such as surveillance and robotics. Its contribution is incremental, since it builds on existing domain adaptation techniques.
The paper addresses two weaknesses of online-learning visual trackers: they underuse background context, and they overfit because little data is available at each time step. It proposes a domain adaptation approach that strengthens the contribution of semantic background context, together with a cost-sensitive loss that counters data imbalance. The resulting tracker runs at real-time speed and performs competitively with state-of-the-art methods.
An online learning policy makes visual trackers more robust to various distortions by learning domain-specific cues. However, trackers that adopt this policy fail to fully exploit the discriminative context of background areas. Moreover, because little data is available at each time step, online learning can make the trackers prone to over-fitting to background regions. In this paper, we propose a domain adaptation approach that strengthens the contribution of the semantic background context. The approach uses only an off-the-shelf deep model as its backbone. Its strength comes from its discriminative ability to handle severe occlusion and background clutter. We further introduce a cost-sensitive loss that alleviates the dominance of non-semantic background candidates over semantic candidates, thereby addressing the data imbalance issue. Experimental results demonstrate that our tracker achieves competitive results at real-time speed compared with state-of-the-art trackers.
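The abstract does not give the form of the cost-sensitive loss, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a cross-entropy over three hypothetical candidate classes (0 = target, 1 = semantic background, 2 = non-semantic background) and assumes inverse-frequency costs, so that the many non-semantic background candidates do not dominate the few semantic ones. All function names and the label encoding are made up for this example.

```python
import math
from collections import Counter


def inverse_frequency_costs(labels):
    """Assumed cost scheme: n / (k * count_c), so rare classes cost more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def log_softmax(scores):
    """Numerically stable log-softmax over one candidate's class scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def cost_sensitive_cross_entropy(scores, labels, costs=None):
    """Cost-weighted mean of per-candidate negative log-likelihoods.

    scores: list of per-candidate class score lists (hypothetical layout).
    labels: list of integer class indices, one per candidate.
    costs:  optional dict mapping class index to cost; defaults to the
            inverse-frequency costs computed from `labels`.
    """
    if costs is None:
        costs = inverse_frequency_costs(labels)
    weighted_sum = 0.0
    total_cost = 0.0
    for row, label in zip(scores, labels):
        w = costs[label]
        weighted_sum += -w * log_softmax(row)[label]
        total_cost += w
    return weighted_sum / total_cost


# One misclassified semantic candidate among four easy non-semantic ones.
scores = [[0.0, 0.0, 5.0]] * 5
labels = [1, 2, 2, 2, 2]
plain = cost_sensitive_cross_entropy(scores, labels, costs={1: 1.0, 2: 1.0})
weighted = cost_sensitive_cross_entropy(scores, labels)
```

In this toy batch the weighted loss is larger than the uniformly weighted one, because the single misclassified semantic candidate carries as much total cost as the four well-classified non-semantic candidates combined. Other cost schemes (for example, hand-tuned or hardness-based costs) would fit the same interface through the `costs` argument.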