Bayesian Adversarial Spheres: Bayesian Inference and Adversarial Examples in a Noiseless Setting
This work addresses adversarial robustness for machine learning practitioners, but it is incremental as it builds on existing toy models and focuses on a noiseless, linear setting.
The paper tackles the problem of adversarial examples in deep learning by analyzing Bayesian inference methods in a simplified adversarial spheres set-up. It finds that Bayesian methods offer advantages over non-Bayesian approaches but still face open challenges for deep learning applications.
Modern deep neural network models suffer from adversarial examples, i.e., confidently misclassified points in the input space. It has been shown that Bayesian neural networks are a promising approach for detecting adversarial points, but careful analysis is difficult because of the complexity of these models. Recently, Gilmer et al. (2018) introduced adversarial spheres, a toy set-up that simplifies both practical and theoretical analysis of the problem. In this work, we use the adversarial spheres set-up to understand the properties of approximate Bayesian inference methods for a linear model in a noiseless setting. We compare predictions of Bayesian and non-Bayesian methods, showcasing the advantages of the former while also revealing open challenges for deep learning applications.
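To make the toy set-up concrete, the following is a minimal sketch of how adversarial-spheres data might be generated. It assumes the configuration described by Gilmer et al. (2018): points drawn uniformly from two concentric spheres, with the class label given by the sphere. The radii (1.0 and 1.3), the dimension (500), and the helper name `sample_spheres` are illustrative assumptions, not details taken from this abstract.

```python
import numpy as np


def sample_spheres(n_points, dim, radii=(1.0, 1.3), seed=0):
    """Hypothetical helper: draw labelled points from two concentric spheres.

    The radii are assumed values for illustration; the label y is the index
    of the sphere a point lies on, so the data are noiseless by construction.
    """
    rng = np.random.default_rng(seed)
    # Normalising Gaussian draws gives points uniform on the unit sphere.
    z = rng.standard_normal((n_points, dim))
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    # Pick a sphere for each point and scale it to that sphere's radius.
    y = rng.integers(0, 2, size=n_points)
    x = z * np.asarray(radii)[y][:, None]
    return x, y


x, y = sample_spheres(n_points=1000, dim=500)
norms = np.linalg.norm(x, axis=1)
```

Because the label depends only on the norm of a point, the two classes never overlap, which is one way to read "noiseless" here. How the paper's linear model is parameterised is not stated in the abstract, so the sketch stops at data generation.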