Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior
This work addresses adversarial attacks on video models, with applications to security and robustness. It is an incremental advance that extends image-based attack methods to video, with a focus on motion dynamics.
The paper tackles the vulnerability of video models to adversarial attacks by exploiting intrinsic movement patterns and regional relative motion among frames. It proposes a motion-excited sampler that generates a motion-aware noise prior. With this prior, the attack succeeds against various video classification models using fewer queries, achieving competitive results on four benchmark datasets.
Deep neural networks are known to be susceptible to adversarial noise, that is, tiny and imperceptible perturbations. Most previous work on adversarial attacks focuses on image models, while the vulnerability of video models is less explored. In this paper, we aim to attack video models by utilizing the intrinsic movement patterns and regional relative motion among video frames. We propose an effective motion-excited sampler to obtain a motion-aware noise prior, which we term the sparked prior. Our sparked prior underlines frame correlations and utilizes video dynamics via relative motion. By using the sparked prior in gradient estimation, we can successfully attack a variety of video classification models with fewer queries. Extensive experimental results on four benchmark datasets validate the efficacy of our proposed method.
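To make the idea of using a motion-aware prior in gradient estimation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a black-box setting where only a scalar loss can be queried, and it substitutes plain frame differences for the relative motion the abstract refers to. The function names `motion_weighted_noise` and `estimate_gradient`, the `floor` parameter, and the two-query finite-difference estimator are all illustrative assumptions.

```python
import numpy as np


def motion_weighted_noise(video, rng, floor=0.1):
    """Sample Gaussian noise scaled by per-pixel frame-difference magnitude.

    Hypothetical stand-in for a motion-aware prior: frame differences are
    used here only for illustration, not the sampler described in the paper.
    `video` is an array of shape (T, H, W).
    """
    diff = np.abs(np.diff(video, axis=0))               # (T-1, H, W)
    motion = np.concatenate([diff[:1], diff], axis=0)   # pad back to (T, H, W)
    weight = motion / (motion.max() + 1e-12)            # normalize to [0, 1]
    noise = rng.standard_normal(video.shape)
    # A small floor keeps static regions from being ignored entirely.
    return noise * (floor + weight)


def estimate_gradient(loss_fn, video, rng, n_samples=20, delta=1e-2):
    """Estimate the loss gradient from queries only (two queries per sample).

    Each sampled direction comes from the motion-weighted noise above, and a
    central finite difference gives the directional derivative along it.
    """
    grad = np.zeros_like(video, dtype=float)
    for _ in range(n_samples):
        u = motion_weighted_noise(video, rng)
        u = u / (np.linalg.norm(u) + 1e-12)              # unit-norm direction
        d = (loss_fn(video + delta * u) - loss_fn(video - delta * u)) / (2 * delta)
        grad += d * u
    return grad / n_samples
```

In this sketch the total query cost is `2 * n_samples` per gradient estimate, so any prior that concentrates the sampled directions on informative regions can, in principle, reduce the number of queries needed. Whether frame differences are a good proxy for the paper's sparked prior is an assumption of the example.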