A single gradient step finds adversarial examples on random two-layers neural networks
This work addresses the vulnerability of neural networks to adversarial attacks and offers theoretical insight for researchers in machine learning security. It is incremental in that it builds on an earlier result.
The paper extends a prior result by proving that a single gradient step can find adversarial examples on random two-layer neural networks, including overcomplete cases where neuron count exceeds input dimension, and for networks with smooth activation functions.
Daniely and Schacham recently showed that gradient descent finds adversarial examples on random undercomplete two-layer ReLU neural networks. The term "undercomplete" refers to the fact that their proof only holds when the number of neurons is a vanishing fraction of the ambient dimension. We extend their result to the overcomplete case, where the number of neurons is larger than the dimension (yet still subexponential in the dimension). In fact, we prove that a single step of gradient descent suffices. We also show this result for any random neural network of subexponential width with a smooth activation function.
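The claim can be illustrated numerically with a minimal sketch. It makes several assumptions that are not taken from the paper: the network is f(x) = sum_i a_i ReLU(w_i . x) with Gaussian weights w_i of variance 1/d per coordinate and signs a_i = ±1/sqrt(k), the input x lies on the sphere of radius sqrt(d), and the step size is chosen by linearizing f around x (a step twice as long as the one that would zero the linearization). The function name `single_step_attack` and the default sizes are hypothetical, chosen only for illustration.

```python
import math
import random


def relu(t):
    return t if t > 0.0 else 0.0


def forward(W, a, x):
    """f(x) = sum_i a_i * ReLU(<w_i, x>) for a two-layer network."""
    return sum(ai * relu(sum(wj * xj for wj, xj in zip(w, x)))
               for w, ai in zip(W, a))


def gradient(W, a, x):
    """Gradient of f at x: sum over active neurons of a_i * w_i."""
    g = [0.0] * len(x)
    for w, ai in zip(W, a):
        if sum(wj * xj for wj, xj in zip(w, x)) > 0.0:
            for j, wj in enumerate(w):
                g[j] += ai * wj
    return g


def single_step_attack(d=200, k=400, seed=0):
    """Take one gradient step on a random overcomplete (k > d) ReLU network.

    Returns (f(x), f(x_adv), ||x_adv - x|| / ||x||).
    """
    rng = random.Random(seed)
    # Assumed initialization: w_i ~ N(0, I/d), a_i uniform in {-1, +1} / sqrt(k).
    W = [[rng.gauss(0.0, 1.0 / math.sqrt(d)) for _ in range(d)]
         for _ in range(k)]
    a = [rng.choice((-1.0, 1.0)) / math.sqrt(k) for _ in range(k)]

    # Random input rescaled to norm sqrt(d).
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in x))
    x = [v * math.sqrt(d) / norm for v in x]

    f0 = forward(W, a, x)
    g = gradient(W, a, x)
    g_sq = sum(v * v for v in g)

    # Illustrative step size (an assumption): the linearization
    # f(x) - eta * sign(f(x)) * ||g||^2 equals -f(x) at this eta.
    eta = 2.0 * abs(f0) / g_sq
    sign = 1.0 if f0 > 0.0 else -1.0
    x_adv = [xj - eta * sign * gj for xj, gj in zip(x, g)]

    f1 = forward(W, a, x_adv)
    rel_norm = eta * math.sqrt(g_sq) / math.sqrt(d)
    return f0, f1, rel_norm


f_before, f_after, rel = single_step_attack()
print(f"f(x) = {f_before:.4f}, f(x_adv) = {f_after:.4f}, "
      f"relative perturbation = {rel:.4f}")
```

In this setup the output of the network should change sign after the single step, while the perturbation stays small relative to the norm of x. The sketch only probes one random draw at modest sizes; it is not a substitute for the high-probability statement proved in the paper.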