Local Competition and Stochasticity for Adversarial Robustness in Deep Learning
This work provides a novel approach to enhancing the adversarial robustness of deep learning models, a critical requirement for deploying AI systems in security-sensitive applications.
This paper introduces deep networks with stochastic local winner-takes-all (LWTA) activations combined with Bayesian non-parametrics to improve adversarial robustness. The method achieves state-of-the-art performance against powerful adversarial attack schemes on benchmark datasets.
This work addresses adversarial robustness in deep learning by considering deep networks with stochastic local winner-takes-all (LWTA) activations. This type of network unit results in sparse representations from each model layer, as the units are organized in blocks where only one unit generates a non-zero output. The main operating principle of the introduced units rests on stochastic arguments, as the network performs posterior sampling over competing units to select the winner. We combine these LWTA arguments with tools from the field of Bayesian non-parametrics, specifically the stick-breaking construction of the Indian Buffet Process, to infer the sub-part of each layer that is essential for modeling the data at hand. Inference is then performed by means of stochastic variational Bayes. We perform a thorough experimental evaluation of our model using benchmark datasets. As we show, our method achieves high robustness to adversarial perturbations, with state-of-the-art performance against powerful adversarial attack schemes.
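To make the two ingredients named in the abstract concrete, the following is a minimal sketch rather than the implementation used in this work. It assumes that the winner in each block is drawn from a softmax over the block's pre-activations and that the winner passes its value through unchanged; the abstract specifies neither. It also shows the standard stick-breaking weights of the Indian Buffet Process, with v_k ~ Beta(alpha, 1) and pi_k equal to the product of v_1 through v_k. The function names, the toy input, and the parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def stochastic_lwta(pre_activations, block_size, rng):
    """Apply a stochastic LWTA nonlinearity to one layer's pre-activations.

    Units are grouped into consecutive blocks of `block_size`. In each block
    one winner is sampled (assumption: from a softmax over the block's
    pre-activations); the winner keeps its value, all other units output zero.
    """
    if len(pre_activations) % block_size != 0:
        raise ValueError("layer width must be a multiple of block_size")
    out = []
    for start in range(0, len(pre_activations), block_size):
        block = pre_activations[start:start + block_size]
        probs = softmax(block)
        winner = rng.choices(range(block_size), weights=probs, k=1)[0]
        out.extend(v if i == winner else 0.0 for i, v in enumerate(block))
    return out


def ibp_stick_breaking(num_components, alpha, rng):
    """Stick-breaking weights of the IBP, truncated to `num_components`.

    v_k ~ Beta(alpha, 1) and pi_k = v_1 * ... * v_k, so the weights are
    non-increasing in k. A component is kept with probability pi_k.
    """
    weights, stick = [], 1.0
    for _ in range(num_components):
        stick *= rng.betavariate(alpha, 1.0)
        weights.append(stick)
    return weights


rng = random.Random(0)  # fixed seed so the sketch is reproducible
h = [0.3, -1.2, 2.0, 0.5, 0.1, -0.4]  # hypothetical pre-activations, 3 blocks of 2
y = stochastic_lwta(h, block_size=2, rng=rng)
pi = ibp_stick_breaking(num_components=4, alpha=2.0, rng=rng)
print(y)
print(pi)
```

Because the winner is sampled rather than chosen by an argmax, repeated forward passes on the same input can activate different units. In this sketch the stick-breaking weights are only sampled from the prior; inferring their posterior by stochastic variational Bayes, as described above, is not shown.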