Adversarial Scratches: Deployable Attacks to CNN Classifiers
This work addresses the challenge of making adversarial attacks practical for physical targets, such as traffic signs, rather than only for digital images. Its contribution is incremental: it improves deployability over existing methods.
The paper tackles the problem of creating deployable adversarial attacks on CNN classifiers. It introduces Adversarial Scratches, a method that draws scratches on images and often achieves a higher fooling rate than other state-of-the-art deployable attacks, while requiring fewer queries and modifying very few pixels.
A growing body of work has shown that deep neural networks are susceptible to adversarial examples. These take the form of small perturbations applied to the model's input which lead to incorrect predictions. Unfortunately, most literature focuses on visually imperceptible perturbations applied to digital images, which are often, by design, impossible to deploy to physical targets. We present Adversarial Scratches: a novel L0 black-box attack, which takes the form of scratches in images, and which possesses much greater deployability than other state-of-the-art attacks. Adversarial Scratches leverage Bézier curves to reduce the dimension of the search space and, optionally, to constrain the attack to a specific location. We test Adversarial Scratches in several scenarios, including a publicly available API and images of traffic signs. Results show that our attack often achieves a higher fooling rate than other deployable state-of-the-art methods, while requiring significantly fewer queries and modifying very few pixels.
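To illustrate how a Bézier curve can shrink the search space, the sketch below draws a single scratch from nine numbers: three control points and one RGB color. It is a minimal illustration under stated assumptions, not the paper's implementation: the quadratic curve order, the single-color scratch, the one-pixel width, and the names `bezier_scratch_mask` and `apply_scratch` are all hypothetical choices. The abstract does not specify the curve order, the optimizer, or the exact parameterization.

```python
import numpy as np


def bezier_scratch_mask(p0, p1, p2, height, width, n_samples=200):
    """Rasterize a quadratic Bezier curve into a boolean pixel mask.

    Control points are (x, y) pairs in pixel coordinates. The quadratic
    order is an assumption made for this sketch.
    """
    t = np.linspace(0.0, 1.0, n_samples)[:, None]  # shape (n_samples, 1)
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    # B(t) = (1 - t)^2 * P0 + 2 * (1 - t) * t * P1 + t^2 * P2
    pts = (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2
    # Round to the nearest pixel and clip so the scratch stays inside the image.
    xs = np.clip(np.rint(pts[:, 0]).astype(int), 0, width - 1)
    ys = np.clip(np.rint(pts[:, 1]).astype(int), 0, height - 1)
    mask = np.zeros((height, width), dtype=bool)
    mask[ys, xs] = True
    return mask


def apply_scratch(image, params):
    """Paint one scratch onto an H x W x 3 float image with values in [0, 1].

    params holds 9 numbers (hypothetical layout):
    x0, y0, x1, y1, x2, y2, r, g, b.
    """
    params = np.asarray(params, dtype=float)
    height, width, _ = image.shape
    mask = bezier_scratch_mask(params[0:2], params[2:4], params[4:6], height, width)
    out = image.copy()
    out[mask] = params[6:9]  # only pixels on the curve change (sparse, L0-style)
    return out, mask


# Example: a red scratch on a blank 32 x 32 image.
image = np.zeros((32, 32, 3))
scratched, mask = apply_scratch(image, [2, 2, 16, 30, 29, 2, 1.0, 0.0, 0.0])
print("pixels modified:", int(mask.sum()), "of", mask.size)
```

Under these assumptions, a black-box optimizer would search over 9 parameters per scratch instead of the 32 × 32 × 3 = 3072 pixel values of even this small image. Restricting the control points to a sub-rectangle would be one way to confine the scratch to a specific location.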