CV CR Aug 22, 2024

Query-Efficient Video Adversarial Attack with Stylized Logo

Duoxun Tang, Yuxin Cao, Xi Xiao, Derui Wang, Sheng Wen, Tianqing Zhu

arXiv:2408.12099v15.22 citations, h-index: 6

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of video DNNs to adversarial attacks, offering a more efficient and realistic attack method for security testing, though it is incremental in building on existing style-transfer and patch-based approaches.

The authors address the problem of generating realistic adversarial examples for video classification systems under a limited query budget. They propose the Stylized Logo Attack (SLA) framework, which uses style references and reinforcement learning, and they report that it outperforms state-of-the-art methods while remaining effective against defenses.

Video classification systems based on Deep Neural Networks (DNNs) have demonstrated excellent performance in accurately verifying video content. However, recent studies have shown that DNNs are highly vulnerable to adversarial examples, so a deeper understanding of adversarial attacks helps in responding to emergency situations. To improve attack performance, many style-transfer-based and patch-based attacks have been proposed. However, the global perturbation used by the former produces unnatural global color, while the latter struggles to succeed in targeted attacks because of its limited perturbation space. Moreover, compared with the plethora of methods targeting image classifiers, video adversarial attacks have received far less attention. Therefore, to generate adversarial examples with a low query budget and higher verisimilitude, we propose a novel black-box video attack framework called Stylized Logo Attack (SLA). SLA proceeds in three steps. The first step builds a style reference set for logos, which not only makes the generated examples more natural but also lets them carry more target-class features in targeted attacks. Then, reinforcement learning (RL) is employed to determine the style reference and the position parameters of the logo within the video, so that the stylized logo is placed in the video with optimal attributes. Finally, a perturbation optimization stage refines the perturbations step by step to improve the fooling rate. Extensive experiments indicate that SLA outperforms state-of-the-art methods and still maintains a good deception effect against various defense methods.
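The following Python sketch illustrates only the general idea of a query-limited, black-box search for a logo and its position in a video. It is not the authors' implementation: plain random search stands in for the RL agent, and the style transfer and perturbation optimization steps are omitted. The names `paste_logo`, `search_logo_placement`, and `toy_query` are hypothetical, and the toy classifier exists solely to make the example runnable.

```python
import numpy as np

def paste_logo(video, logo, top, left):
    """Return a copy of video (T, H, W, C) with logo (h, w, C) pasted on every frame."""
    out = video.copy()
    h, w = logo.shape[:2]
    out[:, top:top + h, left:left + w, :] = logo
    return out

def search_logo_placement(video, logos, query_fn, true_label, budget=50, seed=0):
    """Random search over (logo index, top, left) under a query budget.

    Assumption: query_fn is a black-box classifier that returns class
    probabilities. Random search replaces the RL agent described in the text.
    Returns (params, true_class_prob, queries_used, fooled).
    """
    rng = np.random.default_rng(seed)
    _, H, W, _ = video.shape
    best, best_score = None, float("inf")
    for queries in range(1, budget + 1):
        k = int(rng.integers(len(logos)))        # which logo/style candidate
        h, w = logos[k].shape[:2]
        top = int(rng.integers(0, H - h + 1))    # keep the logo inside the frame
        left = int(rng.integers(0, W - w + 1))
        probs = query_fn(paste_logo(video, logos[k], top, left))
        score = float(probs[true_label])
        if int(np.argmax(probs)) != true_label:  # untargeted success: stop early
            return (k, top, left), score, queries, True
        if score < best_score:                   # otherwise remember the best try
            best, best_score = (k, top, left), score
    return best, best_score, budget, False

def toy_query(video):
    """Hypothetical two-class model: class 1 probability is the mean pixel value."""
    p1 = float(video.mean())
    return np.array([1.0 - p1, p1])

video = np.zeros((4, 16, 16, 3))
logos = [np.ones((8, 8, 3)), np.ones((12, 12, 3))]
params, score, queries, fooled = search_logo_placement(video, logos, toy_query, true_label=0)
```

In this toy setting only the larger logo covers enough of the frame to flip the prediction, so the search stops as soon as it samples that candidate. A real attack would replace `toy_query` with queries to the target video classifier and count each call against the budget.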
