ReabsNet: Detecting and Revising Adversarial Examples
This paper addresses the security problem of adversarial attacks on machine learning systems, offering a defense mechanism that revises rather than rejects adversarial inputs. It appears incremental, however, because it builds on existing detection methods.
The paper tackles the vulnerability of deep neural networks to adversarial perturbations by proposing ReabsNet, a network that detects and revises adversarial examples in order to maintain high classification accuracy. The authors report that it outperforms the state-of-the-art defense method under various attacks.
Although deep neural networks have achieved great success in recent studies and applications, they remain vulnerable to adversarial perturbations that are imperceptible to humans. To address this problem, we propose a novel network called ReabsNet that achieves high classification accuracy in the face of various attacks. The approach is to augment an existing classification network with a guardian network that detects whether a sample is natural or has been adversarially perturbed. Critically, instead of simply rejecting adversarial examples, we revise them to recover their true labels. We exploit the observation that a sample containing adversarial perturbations may return to its true class after revision. We demonstrate that ReabsNet outperforms the state-of-the-art defense method under various adversarial attacks.
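The detect-then-revise control flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `reabs_predict`, its parameters, the `max_revisions` cap, and all three toy components (a mean-threshold classifier, a guardian that flags non-binary pixels, and a reviser that snaps pixels to 0 or 1) are assumptions made for this example. The abstract does not specify how the guardian or the revision step actually work.

```python
from typing import Callable, List, Tuple

Sample = List[float]


def reabs_predict(
    x: Sample,
    classify: Callable[[Sample], int],
    is_adversarial: Callable[[Sample], bool],
    revise: Callable[[Sample], Sample],
    max_revisions: int = 10,  # hypothetical safety cap, not from the paper
) -> Tuple[int, int]:
    """Revise x while the guardian flags it, then classify the result.

    Returns (predicted label, number of revision steps applied).
    """
    revisions = 0
    while is_adversarial(x) and revisions < max_revisions:
        x = revise(x)
        revisions += 1
    return classify(x), revisions


# --- Toy stand-ins (all hypothetical) ---

def toy_classify(x: Sample) -> int:
    # Label 1 if the mean pixel value exceeds 0.5, else label 0.
    return 1 if sum(x) / len(x) > 0.5 else 0


def toy_guardian(x: Sample) -> bool:
    # Natural toy samples are binary; flag any pixel far from both 0 and 1.
    return any(min(abs(p), abs(p - 1.0)) > 0.1 for p in x)


def toy_revise(x: Sample) -> Sample:
    # Snap every pixel to the nearest binary value.
    return [1.0 if p >= 0.5 else 0.0 for p in x]


natural = [1.0, 1.0, 0.0]
perturbed = [0.6, 0.6, 0.2]  # perturbed copy of `natural`

print(toy_classify(perturbed))  # 0: the bare classifier is fooled
print(reabs_predict(natural, toy_classify, toy_guardian, toy_revise))    # (1, 0)
print(reabs_predict(perturbed, toy_classify, toy_guardian, toy_revise))  # (1, 1)
```

In this toy setting the natural sample passes the guardian untouched, while the perturbed sample is revised once and then classified with its original label. A real guardian and reviser would be learned or optimization-based components rather than hand-written rules.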