Colored Noise Injection for Training Adversarially Robust Neural Networks
This work addresses the problem of adversarial robustness for deep learning models, offering an incremental improvement over existing noise-based defenses.
The paper tackles the vulnerability of neural networks to adversarial perturbations by extending adversarial training with colored noise injection, achieving higher adversarial accuracy on the CIFAR-10 and CIFAR-100 datasets than previous methods such as Parametric Noise Injection (PNI).
Although deep learning has shown unmatched performance on various tasks, neural networks are vulnerable to small adversarial perturbations of the input that lead to significant performance degradation. In this work, we extend PNI, which adds white Gaussian noise to the network weights and activations during adversarial training, to the injection of colored noise as a defense against common white-box and black-box attacks. We show that our approach outperforms PNI and various previous approaches in terms of adversarial accuracy on the CIFAR-10 and CIFAR-100 datasets. In addition, we provide an extensive ablation study of the proposed method that justifies the chosen configurations.
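To make the distinction between white and colored noise concrete, the following is a minimal sketch rather than the paper's implementation. It assumes that "colored" means zero-mean Gaussian noise whose components are correlated, drawn here through a Cholesky factor of a covariance matrix. The function names, the AR(1)-style covariance, and the toy weight vector are all hypothetical choices made for illustration; the abstract does not specify how the covariance is parametrized or learned.

```python
import numpy as np

def white_noise(n_samples, dim, sigma, rng):
    """I.i.d. Gaussian noise: covariance is sigma^2 * I (no correlations)."""
    return sigma * rng.standard_normal((n_samples, dim))

def colored_noise(n_samples, chol_factor, rng):
    """Correlated Gaussian noise with covariance chol_factor @ chol_factor.T."""
    z = rng.standard_normal((n_samples, chol_factor.shape[0]))
    return z @ chol_factor.T

rng = np.random.default_rng(0)
dim = 4

# Hypothetical covariance (assumption): variance 0.1 on the diagonal,
# with correlation decaying as 0.8 ** |i - j| between components.
idx = np.arange(dim)
cov = 0.1 * 0.8 ** np.abs(idx[:, None] - idx[None, :])
chol = np.linalg.cholesky(cov)

# Empirical check that the samples carry the intended covariance.
eps = colored_noise(200_000, chol, rng)
emp_cov = np.cov(eps, rowvar=False)

# Perturb a toy weight vector with one colored-noise draw.
weights = np.ones(dim)
noisy_weights = weights + colored_noise(1, chol, rng)[0]
```

With white noise the empirical covariance would be close to diagonal, whereas `emp_cov` here reproduces the off-diagonal structure of `cov`. In a training setting, the factor `chol` would presumably be a learned parameter rather than a fixed matrix.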