CV IV Nov 28, 2022

Imperceptible Adversarial Attack via Invertible Neural Networks

Zihan Chen, Ziyue Wang, Junjie Huang, Wentao Zhao, Xiao Liu, Dejian Guan

CMU

arXiv:2211.15030v311.734 citations, h-index: 12, Has Code

Originality Highly original

AI Analysis

This work addresses the challenge of creating stealthy adversarial attacks on machine learning models, an incremental improvement in adversarial robustness research.

The paper tackles the problem of generating imperceptible adversarial examples by introducing AdvINN, which uses invertible neural networks to add target-class information and remove original-class information. According to the authors, the resulting adversarial images are more robust and less perceptible than those of state-of-the-art methods on CIFAR-10, CIFAR-100, and ImageNet-1K.

Adding perturbations using auxiliary gradient information and discarding existing details of benign images are two common approaches for generating adversarial examples. Though visual imperceptibility is the desired property of adversarial examples, conventional adversarial attacks still generate traceable adversarial perturbations. In this paper, we introduce a novel Adversarial Attack via Invertible Neural Networks (AdvINN) method to produce robust and imperceptible adversarial examples. Specifically, AdvINN takes full advantage of the information preservation property of Invertible Neural Networks and generates adversarial examples by simultaneously adding class-specific semantic information of the target class and dropping discriminant information of the original class. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet-1K demonstrate that the proposed AdvINN method produces less perceptible adversarial images than state-of-the-art methods. AdvINN also yields more robust adversarial examples with high confidence compared to other adversarial attacks.
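The information preservation property mentioned in the abstract can be illustrated with an affine coupling layer, a common building block of invertible networks. The following is a minimal sketch under stated assumptions, not AdvINN's actual architecture: the function names and the toy scale and shift functions (stand-ins for learned sub-networks) are hypothetical and chosen only for illustration.

```python
import math


def _scale(x1):
    # Hypothetical stand-in for a learned sub-network; tanh keeps exp() bounded.
    return [math.tanh(v) for v in x1]


def _shift(x1):
    # Hypothetical stand-in for a learned sub-network.
    return [0.5 * v for v in x1]


def coupling_forward(x):
    """Keep the first half unchanged; transform the second half conditioned on it."""
    assert len(x) % 2 == 0, "this sketch assumes an even-length input"
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = _scale(x1), _shift(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    return x1 + y2


def coupling_inverse(y):
    """Exactly undo coupling_forward: the first half is available unchanged."""
    assert len(y) % 2 == 0, "this sketch assumes an even-length input"
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = _scale(y1), _shift(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2


# Round trip: the input is recovered up to floating-point error.
x = [0.2, -1.0, 0.7, 0.3]
y = coupling_forward(x)
x_rec = coupling_inverse(y)
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
```

Because the inverse recomputes the same scale and shift from the untouched half, no information is lost in the forward pass. The abstract attributes AdvINN's ability to add target-class information while dropping original-class information to this kind of lossless mapping.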

