LG Mar 15, 2021

Constant Random Perturbations Provide Adversarial Robustness with Minimal Effect on Accuracy

Bronya Roni Chernyak, Bhiksha Raj, Tamir Hazan, Joseph Keshet

arXiv:2103.08265v23.11 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses adversarial attacks on neural network models, a concern for machine learning practitioners, by offering an incremental improvement in robustness with minimal loss of standard accuracy.

The paper tackles the problem of improving adversarial robustness in neural networks without significantly reducing standard accuracy. It proposes a non-adversarial training technique that extends the training set with random perturbations drawn around each training example and held constant throughout training. Results on MNIST, SVHN, and CIFAR-10 suggest higher standard accuracy than other defenses and greater robustness than vanilla adversarial training.

This paper proposes an attack-independent (non-adversarial training) technique for improving adversarial robustness of neural network models, with minimal loss of standard accuracy. We suggest creating a neighborhood around each training example, such that the label is kept constant for all inputs within that neighborhood. Unlike previous work that follows a similar principle, we apply this idea by extending the training set with multiple perturbations for each training example, drawn from within the neighborhood. These perturbations are model independent, and remain constant throughout the entire training process. We analyzed our method empirically on MNIST, SVHN, and CIFAR-10, under different attacks and conditions. Results suggest that the proposed approach improves standard accuracy over other defenses while having increased robustness compared to vanilla adversarial training.
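The augmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the shape of the neighborhood, the sampling distribution, or the number of perturbations, so the choices here (uniform noise in an L-infinity ball of radius `epsilon`, inputs scaled to [0, 1], and the function name `augment_with_constant_perturbations` with its default values) are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np


def augment_with_constant_perturbations(x, y, num_perturbations=4,
                                        epsilon=0.1, seed=0):
    """Extend (x, y) with random perturbations sampled once, before training.

    Assumptions (not taken from the paper): uniform noise inside an
    L-infinity ball of radius `epsilon`, inputs scaled to [0, 1], and the
    default values of `num_perturbations` and `epsilon`.
    """
    rng = np.random.default_rng(seed)
    # One noise tensor per (perturbation, example) pair. It is drawn a single
    # time and does not depend on any model, so it stays constant in training.
    noise = rng.uniform(-epsilon, epsilon,
                        size=(num_perturbations,) + x.shape)
    # Clipping keeps inputs valid and never moves a point further from x.
    perturbed = np.clip(x[None, ...] + noise, 0.0, 1.0)
    # Merge the perturbation axis into the example axis.
    perturbed = perturbed.reshape((-1,) + x.shape[1:])
    # Every perturbed copy keeps the label of the example it was drawn around.
    x_aug = np.concatenate([x, perturbed], axis=0)
    y_aug = np.concatenate([y, np.tile(y, num_perturbations)], axis=0)
    return x_aug, y_aug
```

The returned arrays would be built once and then reused unchanged for every epoch, in contrast to adversarial training, where perturbations are recomputed against the current model.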

