A Formalization of Robustness for Deep Neural Networks
This work provides a foundational framework for analyzing adversarial attacks in deep learning. Its contribution is incremental: it formalizes existing concepts rather than introducing new methods.
The authors address the lack of robustness of deep neural networks to small input perturbations by proposing a unifying formalization of adversarial input generation from a formal methods perspective. The result is a general definition of robustness that captures different formulations and is expressive enough to model and compare a variety of attack techniques.
Deep neural networks have been shown to lack robustness to small input perturbations. The process of generating the perturbations that expose this lack of robustness is known as adversarial input generation. This process depends on the goals and capabilities of the adversary. In this paper, we propose a unifying formalization of the adversarial input generation process from a formal methods perspective. We provide a definition of robustness that is general enough to capture different formulations. The expressiveness of our formalization is shown by modeling and comparing a variety of adversarial attack techniques.
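To make the notion of robustness to small input perturbations concrete, the sketch below uses one commonly used formulation, local robustness within an L-infinity ball of radius eps, applied to a toy binary linear classifier. This is an illustrative assumption and not the formalization proposed in the paper, which the abstract does not spell out. The names score, predict, worst_case_perturbation, and is_locally_robust are hypothetical. For a linear model, the perturbation that shifts every coordinate by eps against the sign of its weight reduces the margin the most, so checking that single point decides robustness over the whole ball.

```python
from typing import List, Sequence


def score(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed score of a linear classifier: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Label in {+1, -1}; a score of exactly 0 is assigned to +1."""
    return 1 if score(w, b, x) >= 0 else -1


def worst_case_perturbation(
    w: Sequence[float], b: float, x: Sequence[float], eps: float
) -> List[float]:
    """Point in the L-infinity ball of radius eps around x that pushes the
    score furthest toward the opposite label (exact for a linear model)."""
    label = predict(w, b, x)

    def sign(v: float) -> int:
        return (v > 0) - (v < 0)

    # Move each coordinate by eps against the current label's direction.
    return [xi - eps * label * sign(wi) for wi, xi in zip(w, x)]


def is_locally_robust(
    w: Sequence[float], b: float, x: Sequence[float], eps: float
) -> bool:
    """True if no perturbation with L-infinity norm <= eps changes the label.
    Assumed definition for illustration, not the paper's."""
    x_adv = worst_case_perturbation(w, b, x, eps)
    return predict(w, b, x_adv) == predict(w, b, x)


# Toy example with hypothetical weights and input.
w, b, x = [1.0, -2.0], 0.5, [1.0, 0.0]
print(predict(w, b, x))                  # 1
print(is_locally_robust(w, b, x, 0.25))  # True
print(is_locally_robust(w, b, x, 1.0))   # False
print(worst_case_perturbation(w, b, x, 1.0))
```

In this toy setting, the adversary's capability is the perturbation budget eps and its goal is to flip the predicted label. Changing either one (a different norm, a targeted label) gives a different robustness formulation, which is the kind of variation a unifying definition has to cover.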