Neural Network Robustness as a Verification Property: A Principled Case Study
This work addresses the need for systematic comparison and verification of robustness properties in neural networks. The contribution is incremental: it builds on existing research rather than introducing new methods.
The paper tackles the problem of inconsistent and poorly understood notions of neural network robustness to adversarial attacks. It establishes general principles for the empirical analysis and evaluation of robustness, and demonstrates their practical benefits through a case study.
Neural networks are very successful at detecting patterns in noisy data and have become the technology of choice in many fields. However, their usefulness is hampered by their susceptibility to adversarial attacks. Recently, many methods for measuring and improving a network's robustness to adversarial perturbations have been proposed, and this growing body of research has given rise to numerous explicit or implicit notions of robustness. Connections between these notions are often subtle, and a systematic comparison between them is missing in the literature. In this paper we begin addressing this gap by setting up general principles for the empirical analysis and evaluation of a network's robustness as a mathematical property during the network's training phase, during its verification, and after its deployment. We then apply these principles and conduct a case study that showcases the practical benefits of our general approach.