LG ML Mar 24, 2020

Defense Through Diverse Directions

Christopher M. Bender, Yang Li, Yifeng Shi, Michael K. Reiter, Junier B. Oliva

arXiv:2003.10602v1 3.34 citations

Originality: Incremental advance

AI Analysis

This work addresses adversarial vulnerability in AI systems used in security-critical applications. Its approach is an incremental advance that improves upon existing Bayesian methods.

The paper tackles adversarial robustness in neural networks by developing a Bayesian method that distributes uncertainty evenly across inputs to avoid reliance on brittle features, achieving strong empirical robustness on benchmark datasets without online adversarial training.

In this work we develop a novel Bayesian neural network methodology to achieve strong adversarial robustness without the need for online adversarial training. Unlike previous efforts in this direction, we do not rely solely on the stochasticity of network weights by minimizing the divergence between the learned parameter distribution and a prior. Instead, we additionally require that the model maintain some expected uncertainty with respect to all input covariates. We demonstrate that by encouraging the network to distribute evenly across inputs, the network becomes less susceptible to localized, brittle features, which imparts a natural robustness to targeted perturbations. We show empirical robustness on several benchmark datasets.
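The two regularizers described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal Gaussian distributions for both the weights and the input noise, a standard normal prior, and an entropy bonus as the way to "maintain uncertainty" over input covariates. The function names and the weights `beta` and `lam` are hypothetical.

```python
import math

def kl_diag_gaussian_to_std_normal(mu, log_sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions."""
    return sum(
        0.5 * (math.exp(2.0 * ls) + m * m - 1.0) - ls
        for m, ls in zip(mu, log_sigma)
    )

def diag_gaussian_entropy(log_sigma):
    """Differential entropy of a Gaussian with diagonal covariance."""
    d = len(log_sigma)
    return sum(log_sigma) + 0.5 * d * math.log(2.0 * math.pi * math.e)

def regularized_loss(nll, w_mu, w_log_sigma, x_log_sigma, beta=1.0, lam=0.1):
    """Assumed objective: data term + weight KL - input-noise entropy bonus.

    nll          -- negative log-likelihood of the data (computed elsewhere)
    w_mu, w_log_sigma -- parameters of the weight posterior (assumption: diagonal)
    x_log_sigma  -- per-covariate log noise scale on the inputs (assumption)
    """
    kl = kl_diag_gaussian_to_std_normal(w_mu, w_log_sigma)
    input_entropy = diag_gaussian_entropy(x_log_sigma)
    return nll + beta * kl - lam * input_entropy

# Usage: raising the noise scale on every input covariate lowers the loss.
low = regularized_loss(1.0, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0, 0.0])
high = regularized_loss(1.0, [0.0, 0.0], [0.0, 0.0], [1.0, 1.0, 1.0])
```

Under this assumed form, the entropy term depends on the sum of the log noise scales, which for a fixed total variance is largest when the scales are equal. That is one way a penalty could favor spreading uncertainty evenly across inputs rather than concentrating it on a few.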
