Temporally-Transferable Perturbations: Efficient, One-Shot Adversarial Attacks for Online Visual Object Trackers
This work targets the high computational cost of adversarial attacks on real-time visual object trackers, which is relevant to researchers studying the robustness of these systems.
This paper addresses the computational cost of adversarial attacks on real-time visual object trackers by proposing a method to generate a single, temporally transferable adversarial perturbation from the object template image. This perturbation, when added to every search image, successfully fools the tracker with virtually no cost, outperforming state-of-the-art attacks in untargeted scenarios and extending to targeted attacks.
In recent years, trackers based on Siamese networks have emerged as highly effective and efficient for visual object tracking (VOT). Like most deep networks for visual recognition tasks, these methods have been shown to be vulnerable to adversarial attacks. However, existing attacks on VOT trackers all require perturbing the search region of every input frame to be effective, which comes at a non-negligible cost given that VOT is a real-time task. In this paper, we propose a framework to generate a single temporally transferable adversarial perturbation from the object template image only. This perturbation can then be added to every search image at virtually no cost and still successfully fool the tracker. Our experiments show that our approach outperforms state-of-the-art attacks on the standard VOT benchmarks in the untargeted scenario. Furthermore, we show that our formalism naturally extends to targeted attacks that force the tracker to follow any given trajectory by precomputing diverse directional perturbations.
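The online part of the attack described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the perturbation is taken as already computed offline from the template, and the function names `apply_perturbation` and `pick_direction`, the L-infinity budget `eps`, and the dot-product rule for choosing among precomputed directional perturbations are all hypothetical choices made here for clarity.

```python
import numpy as np

def apply_perturbation(search_image, delta, eps=8.0):
    """Add a precomputed perturbation to one search image (uint8, HxWx3).

    The budget `eps` is an assumed value, not one taken from the paper.
    """
    # Keep the perturbation inside the assumed L-infinity budget.
    bounded = np.clip(delta, -eps, eps)
    # Add in float space, then clip back to the valid pixel range.
    adv = search_image.astype(np.float32) + bounded
    return np.clip(adv, 0.0, 255.0).astype(np.uint8)

def pick_direction(current_xy, target_xy, directional_deltas):
    """Pick the precomputed directional perturbation best aligned with the
    desired motion (hypothetical selection rule for the targeted case).

    `directional_deltas` maps a direction tuple (dx, dy) to a perturbation.
    """
    step = np.asarray(target_xy, dtype=np.float32) - np.asarray(current_xy, dtype=np.float32)
    directions = list(directional_deltas.keys())
    # Score each available direction by its alignment with the desired step.
    scores = [float(np.dot(step, d)) for d in directions]
    return directional_deltas[directions[int(np.argmax(scores))]]
```

In the untargeted case, the same `delta` would be passed to `apply_perturbation` for every frame, so the per-frame cost is one addition and one clip. In the targeted case, `pick_direction` would run once per frame to select which precomputed perturbation to add.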