Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization
This work addresses query-efficient adversarial attacks for security testing of neural networks, offering an incremental improvement in query efficiency.
The paper tackles the problem of generating adversarial examples in the black-box setting, where only query access to the network is available, and reports state-of-the-art attack performance on CIFAR-10 and ImageNet with significantly fewer queries than recent methods.
Solving for adversarial examples with projected gradient descent has been demonstrated to be highly effective at fooling neural-network-based classifiers. However, in the black-box setting, the attacker is limited to query access to the network, and solving for a successful adversarial example becomes much more difficult. To this end, recent methods aim to estimate the true gradient signal from input queries, but at the cost of excessive queries. We propose an efficient discrete surrogate to the optimization problem that does not require estimating the gradient and consequently has no first-order update hyperparameters to tune. Our experiments on CIFAR-10 and ImageNet show state-of-the-art black-box attack performance with a significant reduction in the required queries compared to a number of recently proposed methods. The source code is available at https://github.com/snu-mllab/parsimonious-blackbox-attack.
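To make the gradient-free, discrete formulation concrete, the following Python sketch searches over perturbations using loss queries only. It assumes that the discrete surrogate restricts every coordinate of the perturbation to either -eps or +eps, and it uses a plain greedy block-flip search. It is a simplified illustration, not the released implementation: the function name `greedy_sign_flip_attack`, the `loss_fn` query interface, the block size, and the [0, 1] input range are hypothetical choices made for this example.

```python
import numpy as np

def greedy_sign_flip_attack(loss_fn, x, eps, block=4, max_queries=1000, seed=0):
    """Query-only search over perturbations in {-eps, +eps}^d (illustrative sketch).

    loss_fn(x_adv) -> float is the only access to the model; it is a
    hypothetical interface, and the attacker tries to maximize its value.
    """
    rng = np.random.default_rng(seed)
    d = x.size
    signs = -np.ones(d)  # start at one vertex of the L-infinity ball

    def perturbed(s):
        # Apply the signed perturbation and keep the input in the assumed [0, 1] range.
        return np.clip(x + eps * s.reshape(x.shape), 0.0, 1.0)

    best = loss_fn(perturbed(signs))
    queries = 1
    # Partition the coordinates into contiguous blocks that are flipped together.
    blocks = [np.arange(i, min(i + block, d)) for i in range(0, d, block)]

    while queries < max_queries:
        improved = False
        for b in rng.permutation(len(blocks)):
            if queries >= max_queries:
                break
            signs[blocks[b]] *= -1           # tentatively flip one block
            cand = loss_fn(perturbed(signs))
            queries += 1
            if cand > best:
                best, improved = cand, True  # keep the flip
            else:
                signs[blocks[b]] *= -1       # undo the flip
        if not improved:
            break  # local optimum: no single block flip increases the loss

    return perturbed(signs), best, queries
```

The search has no step size or gradient-estimation parameters to tune, only the block size and the query budget. In practice `loss_fn` would wrap a call to the target classifier (for example, returning the loss on the true label), and each call counts as one query.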