Deterministic Gaussian Averaged Neural Networks
This work addresses the need for efficient and reliable certification of adversarial robustness in neural networks. The contribution is incremental, since it builds on existing stochastic methods.
The paper tackles the problem of certifying neural network robustness to adversarial perturbations. It introduces a deterministic method to compute Gaussian averages that matches the performance of stochastic methods such as randomized smoothing while requiring only a single model evaluation during inference.
We present a deterministic method to compute the Gaussian average of neural networks used in regression and classification. Our method is based on an equivalence between training with a particular regularized loss and the expected values of Gaussian averages. We use this equivalence to certify models that perform well on clean data but are not robust to adversarial perturbations. In terms of certified accuracy and adversarial robustness, our method is comparable to known stochastic methods such as randomized smoothing, but requires only a single model evaluation during inference.
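
To make the quantity being computed concrete, the following is a minimal sketch of the stochastic baseline: a Monte Carlo estimate of the Gaussian average E[f(x + sigma * eps)] with eps ~ N(0, I). It is not the paper's deterministic method. The names `gaussian_average` and `toy_model`, the noise level, and the sample count are all assumptions chosen for illustration, and a quadratic function stands in for a neural network so that the estimate can be checked against a closed form.

```python
import random


def gaussian_average(f, x, sigma, n_samples=10_000, seed=0):
    """Monte Carlo estimate of E[f(x + sigma * eps)], eps ~ N(0, I).

    Hypothetical helper: calls f once per noise sample, so the cost is
    n_samples model evaluations for a single input x.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Perturb every coordinate with independent Gaussian noise.
        noisy = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        total += f(noisy)
    return total / n_samples


def toy_model(x):
    """Stand-in for a scalar-output network (assumption): f(x) = ||x||^2."""
    return sum(xi * xi for xi in x)


x = [1.0, -2.0, 0.5]
sigma = 0.25

estimate = gaussian_average(toy_model, x, sigma)

# For f(x) = ||x||^2 the Gaussian average has the closed form
# ||x||^2 + d * sigma^2, where d is the input dimension.
exact = toy_model(x) + len(x) * sigma ** 2

print(f"Monte Carlo estimate: {estimate:.4f}")
print(f"Closed-form value:    {exact:.4f}")
```

The sketch calls the model `n_samples` times for one input, which is the inference cost that a single-evaluation deterministic method avoids.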