Enhancing Generalization of Universal Adversarial Perturbation through Gradient Aggregation
This work addresses a security vulnerability of deep neural networks, within the field of adversarial machine learning, and offers an incremental improvement over existing UAP generation techniques.
The paper tackles the challenge of generating universal adversarial perturbations (UAPs) that generalize across samples and models. It traces poor generalization to two problems, gradient vanishing and poor local optima, and proposes Stochastic Gradient Aggregation (SGA), which the authors report significantly enhances UAP generalization and outperforms state-of-the-art methods on ImageNet.
Deep neural networks are vulnerable to universal adversarial perturbation (UAP), an instance-agnostic perturbation capable of fooling the target model for most samples. Compared to instance-specific adversarial examples, UAP is more challenging as it needs to generalize across various samples and models. In this paper, we examine a serious dilemma of UAP generation methods from a generalization perspective: the gradient vanishing problem when using small-batch stochastic gradient optimization, and the local optima problem when using large-batch optimization. To address these problems, we propose a simple and effective method called Stochastic Gradient Aggregation (SGA), which alleviates gradient vanishing and escapes from poor local optima at the same time. Specifically, SGA employs small-batch training to perform multiple iterations of inner pre-search. Then, all the inner gradients are aggregated into a one-step gradient estimate to enhance gradient stability and reduce quantization errors. Extensive experiments on the standard ImageNet dataset demonstrate that our method significantly enhances the generalization ability of UAP and outperforms other state-of-the-art methods. The code is available at https://github.com/liuxuannan/Stochastic-Gradient-Aggregation.
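The inner pre-search and aggregation steps described above can be sketched on a toy problem. This is a minimal NumPy illustration and not the authors' implementation (see the linked repository for that). It assumes a linear softmax classifier in place of a deep network, an L-infinity budget `eps`, sign-based steps of size `alpha`, and small batches drawn from within a larger batch. All function names, parameter names, and default values here are hypothetical.

```python
import numpy as np

def loss_grad(delta, x, y, W):
    """Gradient of the mean cross-entropy loss w.r.t. delta for a toy
    linear classifier with logits = (x + delta) @ W (assumed model)."""
    logits = (x + delta) @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)             # softmax probabilities
    p[np.arange(len(y)), y] -= 1.0                # dL/dlogits
    return (p @ W.T).mean(axis=0)                 # shape (d,)

def sga_uap(X, Y, W, eps=0.5, outer_steps=20, inner_steps=4,
            small_bs=8, large_bs=64, alpha=0.05, seed=0):
    """Sketch of gradient aggregation for a universal perturbation.
    Hyperparameters and the sign-step update are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    delta = np.zeros(X.shape[1])
    for _ in range(outer_steps):
        # Draw a large batch; the inner pre-search samples small batches from it.
        big = rng.choice(len(X), size=min(large_bs, len(X)), replace=False)
        inner = delta.copy()          # temporary perturbation for pre-search
        agg = np.zeros_like(delta)    # running sum of inner gradients
        for _ in range(inner_steps):
            small = rng.choice(big, size=small_bs, replace=False)
            g = loss_grad(inner, X[small], Y[small], W)
            agg += g
            # Small-batch ascent step on the temporary perturbation.
            inner = np.clip(inner + alpha * np.sign(g), -eps, eps)
        # One outer step using the aggregated gradient, then project
        # back onto the L-infinity ball of radius eps.
        delta = np.clip(delta + alpha * np.sign(agg), -eps, eps)
    return delta
```

With random data `X` of shape `(n, d)`, weights `W` of shape `(d, k)`, and integer labels `Y`, calling `sga_uap(X, Y, W)` returns a single perturbation vector of shape `(d,)` that is added to every sample. Applying the sign function once to the summed gradient, rather than to each small-batch gradient separately, is one plausible reading of how aggregation could reduce quantization error; the abstract does not spell out the exact update rule.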