Structure Matters: Towards Generating Transferable Adversarial Images
This work addresses the challenge of generating transferable adversarial examples for image classification systems, particularly those with defense mechanisms, offering a novel approach that relaxes perturbation constraints.
The paper tackles the limited transferability of adversarial attacks that results from small perturbation constraints. It introduces structure-aware perturbations that allow perceptible deviations while preserving naturalness, and it reports strong attack ability and high transferability on the MNIST and CIFAR10 datasets, even against defenses.
Recent work on adversarial examples for image classification focuses on directly modifying pixels with minor perturbations. The small-perturbation requirement is imposed to ensure that the generated adversarial examples look natural and realistic to humans. However, it also restricts the attack space, which limits attack ability and transferability, especially against systems protected by a defense mechanism. In this paper, we propose the novel concepts of structure patterns and structure-aware perturbations, which relax the small-perturbation constraint while still keeping images natural. The key idea of our approach is to allow perceptible deviation in adversarial examples while preserving the structure patterns that are central to a human classifier. Building on these concepts, we propose a structure-preserving attack (SPA) for generating natural adversarial examples with extremely high transferability. Empirical results on the MNIST and CIFAR10 datasets show that SPA exhibits strong attack ability in both the white-box and black-box settings, even when defenses are applied. Moreover, when SPA is integrated with the PGD or CW attack, its attack ability increases sharply in the white-box setting without losing the outstanding transferability inherited from SPA.
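For context, the small-perturbation constraint discussed above is commonly formalized as a norm bound on the difference between the adversarial and clean image. The sketch below illustrates that baseline idea with an L-infinity-bounded PGD loop on a toy logistic model. It is not SPA, whose details are not given here. The function names, the toy model, and the values of `eps`, `step`, and `iters` are assumptions chosen for illustration only.

```python
import numpy as np

def project_linf(x_adv, x_clean, eps):
    """Clip x_adv into the L-infinity ball of radius eps around x_clean,
    then into the valid pixel range [0, 1]."""
    x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
    return np.clip(x_adv, 0.0, 1.0)

def pgd_linear(x, y, w, b, eps=0.1, step=0.02, iters=20):
    """PGD against a toy binary logistic model (hypothetical stand-in for a
    real classifier) with logit z = w.x + b and label y in {0, 1}."""
    x_adv = x.copy()
    for _ in range(iters):
        z = float(w @ x_adv + b)
        p = 1.0 / (1.0 + np.exp(-z))      # predicted probability of class 1
        grad = (p - y) * w                # gradient of cross-entropy w.r.t. the input
        # Ascend the loss, then enforce the small-perturbation constraint.
        x_adv = project_linf(x_adv + step * np.sign(grad), x, eps)
    return x_adv

# Toy usage: a flattened 4x4 "image" with pixel values in [0.2, 0.8].
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=16)
w = rng.normal(size=16)
x_adv = pgd_linear(x, y=1, w=w, b=0.0)
print("max pixel change:", np.max(np.abs(x_adv - x)))
```

The projection step is what keeps every pixel within `eps` of its original value. It is this kind of bound that structure-aware perturbations are proposed to relax.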