ML, CV, LG, NE
Mar 9, 2020

Manifold Regularization for Locally Stable Deep Neural Networks

arXiv:2003.04286v2
15 citations
AI Analysis

This work addresses the critical issue of adversarial robustness for deep learning models, offering an incremental improvement with efficient regularization techniques.

The paper applies manifold regularization to train deep neural networks that are locally stable under several perturbation models. On CIFAR-10 it reports 40% adversarial accuracy against an adaptive PGD attack with $\ell_\infty$ perturbations of size $\epsilon = 8/255$, and a state-of-the-art verified accuracy of 21% in the same perturbation model.

We apply concepts from manifold regularization to develop new regularization techniques for training locally stable deep neural networks. Our regularizers are based on a sparsification of the graph Laplacian which holds with high probability when the data is sparse in high dimensions, as is common in deep learning. Empirically, our networks exhibit stability in a diverse set of perturbation models, including $\ell_2$, $\ell_\infty$, and Wasserstein-based perturbations; in particular, we achieve 40% adversarial accuracy on CIFAR-10 against an adaptive PGD attack using $\ell_\infty$ perturbations of size $\epsilon = 8/255$, and state-of-the-art verified accuracy of 21% in the same perturbation model. Furthermore, our techniques are efficient, incurring overhead on par with two additional parallel forward passes through the network.
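To make the graph Laplacian idea in the abstract concrete, here is a minimal sketch of the classic manifold-regularization smoothness penalty, $\sum_{i,j} W_{ij} (f_i - f_j)^2 = 2 f^\top L f$ with $L = D - W$. It is not the paper's exact regularizer, which this page does not spell out. The Gaussian weights, the toy points, and all function names (`gaussian_weights`, `pairwise_penalty`, `laplacian_quadratic_form`, `sparse_penalty`) are assumptions made for illustration. The last step shows why sparsification is plausible when data points are far apart: cross-cluster weights become negligible, so keeping only edges between a point and its nearby perturbed copy gives almost the same penalty.

```python
import math

def gaussian_weights(points, bandwidth):
    """Dense similarity matrix W_ij = exp(-||x_i - x_j||^2 / bandwidth), W_ii = 0."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-sq_dist / bandwidth)
    return W

def pairwise_penalty(W, f):
    """Smoothness penalty sum_ij W_ij (f_i - f_j)^2 for scalar outputs f."""
    n = len(f)
    return sum(W[i][j] * (f[i] - f[j]) ** 2 for i in range(n) for j in range(n))

def laplacian_quadratic_form(W, f):
    """f^T L f with L = D - W, where D is the diagonal degree matrix."""
    n = len(f)
    degree = [sum(row) for row in W]
    diagonal_part = sum(degree[i] * f[i] ** 2 for i in range(n))
    off_diagonal_part = sum(W[i][j] * f[i] * f[j] for i in range(n) for j in range(n))
    return diagonal_part - off_diagonal_part

def sparse_penalty(W, f, pairs):
    """Penalty restricted to the listed undirected edges (each counted in both directions)."""
    return sum(2.0 * W[i][j] * (f[i] - f[j]) ** 2 for i, j in pairs)

# Hypothetical toy data: two well-separated samples, each with one perturbed copy.
points = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.0, 10.1)]
f_vals = [0.2, 0.5, -1.0, -0.7]  # assumed scalar model outputs at those points

W = gaussian_weights(points, bandwidth=1.0)
full = pairwise_penalty(W, f_vals)
via_laplacian = 2.0 * laplacian_quadratic_form(W, f_vals)
sparse = sparse_penalty(W, f_vals, pairs=[(0, 1), (2, 3)])

print("full pairwise penalty:   ", full)
print("2 * f^T L f:             ", via_laplacian)
print("within-pair edges only:  ", sparse)
```

In this toy setup the sparse version evaluates the model only at each sample and its perturbed copy, which is loosely in the spirit of the "two additional forward passes" cost mentioned in the abstract, though the correspondence to the paper's construction is an assumption here.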

Code Implementations: 1 repo