ML, CR, CV, LG | May 27, 2019

Scaleable input gradient regularization for adversarial robustness

arXiv:1905.11468v2 | 94 citations
Originality: Incremental advance
AI Analysis

This work addresses adversarial robustness for machine learning models, offering a scalable alternative to adversarial training, though it is incremental as it builds on existing gradient regularization techniques.

The authors propose a scalable input gradient regularization method that avoids double backpropagation. Adversarially robust ImageNet models are trained in 33 hours on four consumer-grade GPUs, and the resulting robustness is competitive with adversarial training.

In this work we revisit gradient regularization for adversarial robustness with some new ingredients. First, we derive new per-image theoretical robustness bounds based on local gradient information. These bounds strongly motivate input gradient regularization. Second, we implement a scaleable version of input gradient regularization which avoids double backpropagation: adversarially robust ImageNet models are trained in 33 hours on four consumer grade GPUs. Finally, we show experimentally and through theoretical certification that input gradient regularization is competitive with adversarial training. Moreover we demonstrate that gradient regularization does not lead to gradient obfuscation or gradient masking.
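As a rough illustration of how an input gradient penalty can be estimated without double backpropagation, the sketch below uses a finite-difference approximation: the squared gradient norm is replaced by the squared directional derivative of the loss along the normalized gradient direction. The abstract does not spell out the mechanism, so the finite-difference choice, the toy log-sum-exp loss, the step size `h`, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import math

def loss(x):
    # Hypothetical smooth loss (log-sum-exp), standing in for a network loss.
    m = max(x)
    return m + math.log(sum(math.exp(v - m) for v in x))

def grad_loss(x):
    # Analytic input gradient of log-sum-exp, which is the softmax of x.
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))

def fd_grad_norm_sq(loss_fn, grad_fn, x, h=1e-3):
    # Estimate ||grad loss(x)||^2 with one extra loss evaluation.
    # The direction d is treated as a constant, so in a training loop the
    # penalty would only need first-order backpropagation (assumption).
    g = grad_fn(x)
    n = l2_norm(g)
    if n == 0.0:
        return 0.0
    d = [c / n for c in g]
    x_step = [xi + h * di for xi, di in zip(x, d)]
    return ((loss_fn(x_step) - loss_fn(x)) / h) ** 2

x = [0.5, -1.0, 2.0]
exact = l2_norm(grad_loss(x)) ** 2        # exact squared gradient norm
approx = fd_grad_norm_sq(loss, grad_loss, x)  # finite-difference estimate
print(exact, approx)
```

In an actual training setup the estimate would be added to the classification loss with a weighting coefficient. The approximation error shrinks with `h` but is limited by floating-point cancellation when `h` is very small.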

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
