Lipschitz regularized Deep Neural Networks generalize and are adversarially robust
This work addresses the problem of making deep learning models more reliable and more secure against adversarial attacks, though the contribution appears incremental because it combines existing regularization techniques.
The authors aim to improve generalization and adversarial robustness in deep neural networks through input gradient regularization. They show that this regularization yields a provable generalization bound that is independent of network depth, and they show empirically that it increases robustness and that gradient norms are useful for attack detection.
In this work we study input gradient regularization of deep neural networks, and demonstrate that such regularization leads to a proof of generalization and to improved adversarial robustness. The proof of generalization does not overcome the curse of dimensionality, but it is independent of the number of layers in the network. The regularization for adversarial robustness combines adversarial training, which we show to be equivalent to Total Variation regularization, with Lipschitz regularization. We demonstrate empirically that the regularized models are more robust, and that the gradient norms of images can be used for attack detection.
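To make the idea of input gradient regularization concrete, the following is a minimal sketch on a toy logistic model, not the authors' implementation. It assumes that the Total Variation-style term can be approximated by the average norm of the loss gradient with respect to the input, and the Lipschitz-style term by the largest such norm over the batch. The function names, the weights `lam_tv` and `lam_lip`, and the detection `threshold` are all hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic model (toy stand-in for a network)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(w, b, x, y):
    """Binary cross-entropy on one example, with label y in {0, 1}."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def input_grad_norm(w, b, x, y):
    """Euclidean norm of d(loss)/dx. For this model the gradient is (p - y) * w."""
    p = predict(w, b, x)
    return abs(p - y) * math.sqrt(sum(wi * wi for wi in w))


def finite_diff_grad_norm(w, b, x, y, h=1e-6):
    """Central-difference check of the analytic input gradient norm."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((loss(w, b, x_plus, y) - loss(w, b, x_minus, y)) / (2 * h))
    return math.sqrt(sum(g * g for g in grad))


def regularized_loss(w, b, xs, ys, lam_tv=0.1, lam_lip=0.1):
    """Mean loss plus two input-gradient penalties (assumed form, see lead-in)."""
    losses = [loss(w, b, x, y) for x, y in zip(xs, ys)]
    norms = [input_grad_norm(w, b, x, y) for x, y in zip(xs, ys)]
    data_term = sum(losses) / len(losses)
    tv_term = sum(norms) / len(norms)  # average gradient norm (Total Variation-style)
    lip_term = max(norms)              # largest gradient norm (Lipschitz-style)
    return data_term + lam_tv * tv_term + lam_lip * lip_term


def flag_suspicious(w, b, xs, ys, threshold):
    """Flag inputs whose loss-gradient norm exceeds a hypothetical threshold."""
    return [input_grad_norm(w, b, x, y) > threshold for x, y in zip(xs, ys)]
```

In a deep network the input gradient would come from automatic differentiation rather than a closed form. The logistic model is used here only because its input gradient can be written by hand and checked against finite differences.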