LG CR CV ML Jul 3, 2019

Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

arXiv:1907.02044v2, 604 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and scalable adversarial attack methods to test classifier robustness, though it is incremental as it builds on existing attack paradigms.

The authors tackled the problem of evaluating neural network robustness by proposing a new white-box adversarial attack that finds minimal perturbations across multiple l_p-norms, achieving better or similar performance to state-of-the-art methods while being robust to gradient masking.

The robustness of neural-network-based classifiers against adversarial manipulation is mainly evaluated with empirical attacks, as methods for exact computation, even when available, do not scale to large networks. In this paper we propose a new white-box adversarial attack with respect to the $l_p$-norms for $p \in \{1,2,\infty\}$ that aims to find the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, quickly yields high-quality results, and minimizes the size of the perturbation (so that a single run returns the robust accuracy at every threshold). It performs better than or similarly to state-of-the-art attacks, which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient masking.
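The geometric idea of a minimal perturbation is easiest to see for a linear decision boundary, where the smallest $l_p$ step onto the hyperplane $w \cdot x + b = 0$ has a closed form. The sketch below illustrates that textbook case only; it is not the authors' algorithm, which handles nonlinear networks and input constraints. All function names (`min_perturbation`, `norm`, `robust_accuracy`) are hypothetical. The last helper shows why minimal norms give robust accuracy at every threshold from one run, assuming misclassified points are recorded with norm 0.

```python
import math

def min_perturbation(w, b, x, p):
    """Smallest l_p perturbation that moves x onto the hyperplane w.x + b = 0."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b  # signed score at x
    if p == 2:
        # Move along w; the distance is |f| / ||w||_2.
        sq = sum(wi * wi for wi in w)
        return [-f * wi / sq for wi in w]
    if p == math.inf:
        # Move every coordinate with w_i != 0 by the same amount;
        # the distance is |f| / ||w||_1.
        l1 = sum(abs(wi) for wi in w)
        return [-f / l1 * math.copysign(1.0, wi) if wi != 0 else 0.0 for wi in w]
    if p == 1:
        # Move only the coordinate with the largest |w_i|;
        # the distance is |f| / ||w||_inf.
        k = max(range(len(w)), key=lambda i: abs(w[i]))
        return [-f / w[k] if i == k else 0.0 for i in range(len(w))]
    raise ValueError("p must be 1, 2, or math.inf")

def norm(v, p):
    """l_p norm of a list of floats for p in {1, 2, inf}."""
    if p == math.inf:
        return max(abs(vi) for vi in v)
    return sum(abs(vi) ** p for vi in v) ** (1.0 / p)

def robust_accuracy(min_norms, eps):
    """Fraction of points whose minimal adversarial norm exceeds eps.

    Assumption: points that are already misclassified are stored as 0.0.
    """
    return sum(n > eps for n in min_norms) / len(min_norms)

if __name__ == "__main__":
    w, b, x = [3.0, 4.0], -2.0, [1.0, 1.0]
    for p in (1, 2, math.inf):
        delta = min_perturbation(w, b, x, p)
        print(p, delta, norm(delta, p))
    # One list of minimal norms answers every threshold without rerunning an attack.
    norms = [0.0, 0.5, 1.0, 2.0]
    print([robust_accuracy(norms, eps) for eps in (0.25, 0.5, 1.5)])
```

An attack that only searches within a fixed budget would have to be rerun for each threshold, whereas a list of minimal norms can be thresholded after the fact.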

Code Implementations: 2 repos
