Adaptive Generation of Unrestricted Adversarial Inputs
This addresses the vulnerability of neural networks to more realistic and unrestricted adversarial attacks, offering a significant speed improvement and adaptive capability, though it is incremental in advancing unrestricted attack methods.
The paper tackles the problem of generating unrestricted adversarial inputs for neural networks, introducing an adaptive algorithm that achieves a 400-2,000x speedup over prior methods and can defeat classifiers robust to perturbation-based attacks.
Neural networks are vulnerable to adversarially-constructed perturbations of their inputs. Most research so far has considered perturbations of a fixed magnitude under some $l_p$ norm. Although studying these attacks is valuable, there has been increasing interest in the construction of (and robustness to) unrestricted attacks, which are not constrained to a small and rather artificial subset of all possible adversarial inputs. We introduce a novel algorithm for generating such unrestricted adversarial inputs which, unlike prior work, is adaptive: it is able to tune its attacks to the classifier being targeted. It also offers a 400-2,000x speedup over the existing state of the art. We demonstrate our approach by generating unrestricted adversarial inputs that fool classifiers robust to perturbation-based attacks. We also show that, by virtue of being adaptive and unrestricted, our attack is able to defeat adversarial training against it.