LG AI CV NE ML · Sep 16, 2019

They Might NOT Be Giants: Crafting Black-Box Adversarial Examples with Fewer Queries Using Particle Swarm Optimization

Rayan Mosli, Matthew Wright, Bo Yuan, Yin Pan

arXiv:1909.07490v16.618 citations

Originality: Incremental advance

AI Analysis

The paper addresses a challenge relevant to security researchers and practitioners: making black-box adversarial attacks both efficient and effective. The advance is incremental, since it builds on existing optimization techniques.

The paper tackles the problem of creating adversarial examples in black-box settings with fewer queries, achieving success rates of 99.6% on CIFAR-10, 96.3% on MNIST, and 82.0% on ImageNet while using substantially fewer queries than state-of-the-art methods.

Machine learning models have been found to be susceptible to adversarial examples that are often indistinguishable from the original inputs. These adversarial examples are created by applying adversarial perturbations to input samples, which cause them to be misclassified by the target models. Attacks that search for and apply the perturbations to create adversarial examples are performed in both white-box and black-box settings, depending on the information available to the attacker about the target. For black-box attacks, the only capability available to the attacker is the ability to query the target with specially crafted inputs and observe the labels returned by the model. Current black-box attacks either have low success rates, require a high number of queries, or produce adversarial examples that are easily distinguishable from their sources. In this paper, we present AdversarialPSO, a black-box attack that uses fewer queries to create adversarial examples with high success rates. AdversarialPSO is based on the evolutionary search algorithm Particle Swarm Optimization, a population-based, gradient-free optimization algorithm. It is flexible in balancing the number of queries submitted to the target against the quality of imperceptible adversarial examples. The attack has been evaluated using the image classification benchmark datasets CIFAR-10, MNIST, and ImageNet, achieving success rates of 99.6%, 96.3%, and 82.0%, respectively, while submitting substantially fewer queries than the state-of-the-art. We also present a black-box method for isolating salient features used by models when making classifications. This method, called Swarms with Individual Search Spaces or SWISS, creates adversarial examples by finding and modifying the most important features in the input.
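To make the idea of a gradient-free, query-based search concrete, the following is a minimal sketch of a textbook particle swarm loop applied to a toy classifier. It is not the authors' AdversarialPSO implementation. Several things here are assumptions for illustration only: the target is assumed to return class probabilities (the abstract mentions only labels), and the names `toy_model` and `pso_attack`, the fitness function, and all hyperparameters (`eps`, `w`, `c1`, `c2`, swarm size) are hypothetical.

```python
import numpy as np


def toy_model(x):
    """Hypothetical black box: two-class softmax driven by the mean of x."""
    s = 10.0 * (x.mean() - 0.5)
    logits = np.array([s, -s])
    e = np.exp(logits - logits.max())
    return e / e.sum()


def pso_attack(query, x, true_label, eps=0.1, n_particles=20, n_iters=50, seed=0):
    """Search for a perturbation in [-eps, eps] that flips the predicted label.

    Returns (adversarial_input_or_None, number_of_queries_used).
    """
    rng = np.random.default_rng(seed)
    # Each particle is one candidate perturbation with the same shape as x.
    pos = rng.uniform(-eps, eps, size=(n_particles,) + x.shape)
    vel = np.zeros_like(pos)
    queries = 0

    def fitness(delta):
        # One model query per evaluation; lower true-class probability is better.
        nonlocal queries
        queries += 1
        probs = query(np.clip(x + delta, 0.0, 1.0))
        return probs[true_label], int(np.argmax(probs))

    # Evaluate the initial swarm and stop early if any particle already succeeds.
    pbest = pos.copy()
    pbest_fit = np.empty(n_particles)
    for i in range(n_particles):
        pbest_fit[i], label = fitness(pos[i])
        if label != true_label:
            return np.clip(x + pos[i], 0.0, 1.0), queries
    g = int(np.argmin(pbest_fit))
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social weights (assumed)
    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            # Standard PSO update: pull toward personal and global bests.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            pos[i] = np.clip(pos[i] + vel[i], -eps, eps)
            fit, label = fitness(pos[i])
            if label != true_label:
                return np.clip(x + pos[i], 0.0, 1.0), queries
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i].copy(), fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i].copy(), fit
    return None, queries


x0 = np.full(4, 0.51)  # toy input that the toy model assigns to class 0
adv, n_queries = pso_attack(toy_model, x0, true_label=0)
print("success:", adv is not None, "queries:", n_queries)
```

The sketch counts every call to the model, which is the cost a black-box attacker tries to minimize, and it stops at the first particle that changes the predicted label. The `eps` bound stands in for the trade-off the abstract describes between query count and perturbation size: a larger bound typically makes the search easier but the perturbation more visible.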
