Neural Networks in Adversarial Setting and Ill-Conditioned Weight Space
This work addresses a security concern for neural network users by proposing a method to improve robustness against adversarial attacks, though the contribution is incremental because it builds on existing regularization techniques.
The paper tackles the problem of neural networks' susceptibility to adversarial examples by hypothesizing that ill-conditioned weight matrices contribute to this vulnerability, and it shows that using an orthogonal regularizer to lower the condition number increases adversarial accuracy on MNIST and F-MNIST datasets.
Recently, neural networks have seen a huge surge in adoption due to their ability to provide high accuracy on various tasks. On the other hand, the existence of adversarial examples has raised suspicions regarding the generalization capabilities of neural networks. In this work, we focus on the weight matrix learnt by the neural networks and hypothesize that an ill-conditioned weight matrix is one of the contributing factors in a neural network's susceptibility to adversarial examples. To ensure that the condition number of the learnt weight matrix remains sufficiently low, we suggest using an orthogonal regularizer. We show that this indeed helps increase adversarial accuracy on the MNIST and F-MNIST datasets.
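To make the link between orthogonality and condition number concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract gives no formula, so the penalty is assumed to take the common form ||W^T W - I||_F^2, and the helper names, the 4x4 matrix, the step size, and the iteration count are all hypothetical. The sketch minimizes the penalty by itself on a deliberately ill-conditioned matrix; in actual training the penalty would be added to the task loss with some weighting.

```python
import numpy as np

def condition_number(W):
    """2-norm condition number: largest singular value over smallest."""
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    return s[0] / s[-1]

def orthogonal_penalty(W):
    """Assumed regularizer form: squared Frobenius norm of (W^T W - I)."""
    gram = W.T @ W
    return np.sum((gram - np.eye(W.shape[1])) ** 2)

def orthogonal_penalty_grad(W):
    """Analytic gradient of the penalty with respect to W: 4 W (W^T W - I)."""
    return 4.0 * W @ (W.T @ W - np.eye(W.shape[1]))

rng = np.random.default_rng(0)

# Build an ill-conditioned matrix with known singular values (1 down to 1e-3).
U, _ = np.linalg.qr(rng.normal(size=(4, 4)))
V, _ = np.linalg.qr(rng.normal(size=(4, 4)))
W = U @ np.diag([1.0, 0.5, 0.1, 1e-3]) @ V.T

cond_before = condition_number(W)
penalty_before = orthogonal_penalty(W)

# Plain gradient descent on the penalty alone (hypothetical step size and count).
for _ in range(200):
    W = W - 0.05 * orthogonal_penalty_grad(W)

cond_after = condition_number(W)
penalty_after = orthogonal_penalty(W)

print(f"condition number: {cond_before:.1f} -> {cond_after:.4f}")
print(f"penalty:          {penalty_before:.4f} -> {penalty_after:.2e}")
```

The penalty is zero exactly when the columns of W are orthonormal, in which case every singular value equals 1 and the condition number is 1. Driving the penalty down therefore pushes the singular values toward each other, which is the mechanism by which an orthogonal regularizer can keep the condition number low.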