LG · CR · ML · Feb 22, 2020

Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples

arXiv:2002.09632v2 · 20 citations
AI Analysis

This paper addresses the computational bottleneck of adversarial training for neural network classifiers, offering a cheaper defense against adversarial examples.

The paper tackles the computational cost and limited defense capability of existing adversarial training methods by proposing a single-step adversarial training approach that defends against both single-step and iterative adversarial examples. On CIFAR10, the method improves test accuracy by 35.67% and cuts training time by 19.14% relative to ATDA, a single-step method. Relative to iterative methods that train on BIM or Madry examples, it saves up to 76.03% of training time while losing less than 3.78% in test accuracy.

Adversarial examples have become one of the largest challenges that machine learning models, especially neural network classifiers, face. These adversarial examples break the assumption of an attack-free scenario and fool state-of-the-art (SOTA) classifiers with perturbations that are imperceptible to humans. So far, researchers have made great progress in using adversarial training as a defense. However, its overwhelming computational cost limits its applicability, and little has been done to overcome this issue. Single-step adversarial training methods have been proposed as computationally viable solutions; however, they still fail to defend against iterative adversarial examples. In this work, we first experimentally analyze several SOTA defense methods against adversarial examples. Then, based on observations from these experiments, we propose a novel single-step adversarial training method that can defend against both single-step and iterative adversarial examples. Lastly, through extensive evaluations, we demonstrate that our proposed method outperforms the SOTA single-step and iterative adversarial training defenses. Compared with ATDA (a single-step method) on the CIFAR10 dataset, our proposed method achieves a 35.67% improvement in test accuracy and a 19.14% reduction in training time. Compared with methods that use BIM or Madry examples (iterative methods) on the CIFAR10 dataset, it saves up to 76.03% in training time with less than 3.78% degradation in test accuracy.
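The contrast the abstract draws between single-step and iterative adversarial examples can be illustrated with a minimal sketch. It assumes a toy logistic-regression classifier, uses FGSM as a representative single-step attack and BIM as the iterative one, and bounds the perturbation in the L-infinity norm. It is not the paper's proposed training method, and every name and parameter here (`fgsm`, `bim`, `bce_loss`, `eps`, `alpha`, `steps`) is hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(w, b, x, y):
    """Binary cross-entropy loss of a logistic-regression model on one example."""
    p = sigmoid(x @ w + b)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    return (sigmoid(x @ w + b) - y) * w


def fgsm(w, b, x, y, eps):
    """Single-step attack: one gradient evaluation, one signed step of size eps."""
    return x + eps * np.sign(input_gradient(w, b, x, y))


def bim(w, b, x, y, eps, alpha, steps):
    """Iterative attack: `steps` gradient evaluations, each followed by a
    projection back into the L-infinity ball of radius eps around x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(input_gradient(w, b, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv


# Toy model and input (assumed values, for illustration only).
rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.0
x, y = rng.normal(size=5), 1.0
eps = 0.1

x_fgsm = fgsm(w, b, x, y, eps)
x_bim = bim(w, b, x, y, eps, alpha=0.025, steps=10)

print("clean loss:", bce_loss(w, b, x, y))
print("FGSM loss: ", bce_loss(w, b, x_fgsm, y))
print("BIM loss:  ", bce_loss(w, b, x_bim, y))
```

The iterative attack needs `steps` gradient evaluations per example where the single-step attack needs one, which is the source of the training-cost gap the abstract describes. On this linear toy model the sign of the input gradient never changes, so with these settings both attacks end at the same point; the two only diverge on nonlinear networks, where each iteration can follow a different gradient direction.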

