LG CR CV | Apr 26, 2023

Improving Adversarial Transferability via Intermediate-level Perturbation Decay

Qizhang Li, Yiwen Guo, Wangmeng Zuo, Hao Chen

arXiv:2304.13410v320.443 citations | h-index: 103 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of crafting more transferable adversarial examples, which matters for security testing and robustness evaluation. The advance is incremental, as it builds on existing intermediate-level attack frameworks.

The paper tackles sub-optimal adversarial transferability in intermediate-level attacks by proposing a single-stage optimization method called ILPD. Compared with state-of-the-art methods, it improves attack success rates by +10.07% on average on ImageNet and by +3.88% on average on CIFAR-10.

Intermediate-level attacks, which attempt to drastically perturb feature representations along an adversarial direction, have shown favorable performance in crafting transferable adversarial examples. Existing methods in this category are normally formulated with two separate stages: a directional guide must first be determined, and the scalar projection of the intermediate-level perturbation onto that guide is then enlarged. The obtained perturbation inevitably deviates from the guide in the feature space, and this paper reveals that such a deviation may lead to a sub-optimal attack. To address this issue, we develop a novel intermediate-level method that crafts adversarial examples within a single stage of optimization. In particular, the proposed method, named intermediate-level perturbation decay (ILPD), encourages the intermediate-level perturbation to be in an effective adversarial direction and to possess a great magnitude simultaneously. In-depth discussion verifies the effectiveness of our method. Experimental results show that it outperforms state-of-the-art methods by large margins in attacking various victim models on ImageNet (+10.07% on average) and CIFAR-10 (+3.88% on average). Our code is at https://github.com/qizhangli/ILPD-attack.
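
The scalar projection that the two-stage methods in the abstract enlarge can be made concrete with a small sketch. This is an illustration only, not the authors' implementation (their repository is the reference for that). The function names `scalar_projection` and `cosine_to_guide`, and the choice to represent features as flattened lists of floats, are assumptions introduced here. The sketch shows that a perturbation can have a sizeable projection onto the guide while still deviating from it in direction, which is the deviation the abstract refers to.

```python
import math
from typing import Sequence


def _norm(vec: Sequence[float]) -> float:
    """Euclidean norm of a flattened feature vector."""
    return math.sqrt(sum(v * v for v in vec))


def scalar_projection(perturbation: Sequence[float], guide: Sequence[float]) -> float:
    """Scalar projection of a feature-space perturbation onto a directional guide.

    Both arguments are hypothetical flattened intermediate-level features:
    `perturbation` stands for h(x_adv) - h(x), `guide` for the guide direction.
    """
    if len(perturbation) != len(guide):
        raise ValueError("perturbation and guide must have the same length")
    guide_norm = _norm(guide)
    if guide_norm == 0.0:
        raise ValueError("guide must be a non-zero vector")
    dot = sum(p * g for p, g in zip(perturbation, guide))
    return dot / guide_norm


def cosine_to_guide(perturbation: Sequence[float], guide: Sequence[float]) -> float:
    """Cosine of the angle between perturbation and guide (1.0 means no deviation)."""
    pert_norm = _norm(perturbation)
    if pert_norm == 0.0:
        raise ValueError("perturbation must be a non-zero vector")
    return scalar_projection(perturbation, guide) / pert_norm


if __name__ == "__main__":
    guide = [3.0, 4.0]
    aligned = [3.0, 4.0]   # points exactly along the guide
    deviated = [6.0, 0.0]  # larger magnitude, but off-direction
    for name, pert in (("aligned", aligned), ("deviated", deviated)):
        print(name, scalar_projection(pert, guide), cosine_to_guide(pert, guide))
```

In a two-stage attack the first quantity is the one being maximized, while the second is a simple way to measure how far the resulting perturbation has drifted from the guide.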
