A general framework for defining and optimizing robustness
This work addresses the need of machine learning researchers and practitioners for a precise and flexible robustness framework, particularly for safety and transferability applications. The contribution is incremental in that it builds on existing robustness investigations.
The authors address the lack of a common foundation for robustness concepts in neural networks by proposing a general framework for defining and optimizing robustness properties. The framework treats robustness as independent of accuracy and applies to any type of classification model. For two specific robustness objectives, they also introduce new learning approaches based on neural network co-training.
Robustness of neural networks has recently attracted a great deal of interest. The many investigations in this area lack a precise common foundation of robustness concepts. Therefore, in this paper, we propose a rigorous and flexible framework for defining different types of robustness properties for classifiers. Our robustness concept is based on the postulates that robustness of a classifier should be considered a property that is independent of accuracy, and that it should be defined in purely mathematical terms without reliance on algorithmic procedures for its measurement. We develop a very general robustness framework that is applicable to any type of classification model, and that encompasses relevant robustness concepts for investigations ranging from safety against adversarial attacks to transferability of models to new domains. For two prototypical, distinct robustness objectives, we then propose new learning approaches based on neural network co-training strategies for obtaining image classifiers optimized for these respective objectives.
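The postulate that robustness is independent of accuracy can be illustrated with a small sketch. This is not the framework defined in the paper: as an assumption made purely for illustration, robustness is taken here to be the probability that a prediction stays unchanged under a uniform perturbation in an L-infinity ball, and that quantity is approximated by sampling. The names `linear_classifier`, `stability_estimate`, and `accuracy`, as well as the toy data, are hypothetical.

```python
import random

def linear_classifier(weights, bias):
    """Return a binary classifier x -> {0, 1} given by the sign of w.x + b."""
    def predict(x):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1 if score >= 0 else 0
    return predict

def stability_estimate(predict, points, radius, n_samples=200, seed=0):
    """Monte Carlo estimate of P[f(x + delta) == f(x)], with delta drawn
    uniformly from the L-infinity ball of the given radius.
    Illustrative assumption, not the paper's definition. No labels are used."""
    rng = random.Random(seed)
    stable = 0
    total = 0
    for x in points:
        base = predict(x)
        for _ in range(n_samples):
            perturbed = [xi + rng.uniform(-radius, radius) for xi in x]
            stable += int(predict(perturbed) == base)
            total += 1
    return stable / total

def accuracy(predict, points, labels):
    """Fraction of points whose prediction matches the given label."""
    hits = sum(int(predict(x) == y) for x, y in zip(points, labels))
    return hits / len(points)

# Toy data (hypothetical): two points far from the decision boundary, two close to it.
points = [[2.0, 0.0], [-2.0, 0.0], [0.05, 1.0], [-0.05, -1.0]]
labels = [1, 0, 1, 0]
flipped_labels = [1 - y for y in labels]

f = linear_classifier([1.0, 0.0], 0.0)         # boundary at x[0] = 0
constant = linear_classifier([0.0, 0.0], 1.0)  # always predicts class 1

print("accuracy of f, original labels:", accuracy(f, points, labels))
print("accuracy of f, flipped labels: ", accuracy(f, points, flipped_labels))
print("stability of f:                ", stability_estimate(f, points, 0.1))
print("stability of constant model:   ", stability_estimate(constant, points, 0.1))
```

Flipping the labels moves the accuracy of `f` from 1.0 to 0.0, while its stability estimate cannot change because labels never enter the computation. The constant model is perfectly stable yet only 50% accurate on this data, which is one way to see why the two properties can be treated separately. The sampling step is only an approximation of the underlying probability; the quantity itself is defined without reference to any measurement procedure.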