CV, CR, LG | Feb 2, 2022

An Eye for an Eye: Defending against Gradient-based Attacks with Gradients

arXiv:2202.01117v1 | 3 citations
Originality: Incremental advance
AI Analysis

This paper addresses the vulnerability of deep learning models to adversarial attacks, a critical security issue for AI systems, though it is an incremental improvement over existing defense methods.

The paper tackles the problem of defending deep learning models against gradient-based adversarial attacks by proposing a Two-stream Restoration Network (TRN) that uses gradient maps and adversarial images as inputs to restore perturbed images, achieving state-of-the-art performance on datasets like CIFAR10, SVHN, and Fashion MNIST.

Deep learning models have been shown to be vulnerable to adversarial attacks. In particular, gradient-based attacks have demonstrated high success rates recently. The gradient measures how each image pixel affects the model output, which contains critical information for generating malicious perturbations. In this paper, we show that the gradients can also be exploited as a powerful weapon to defend against adversarial attacks. By using both gradient maps and adversarial images as inputs, we propose a Two-stream Restoration Network (TRN) to restore the adversarial images. To optimally restore the perturbed images with two streams of inputs, a Gradient Map Estimation Mechanism is proposed to estimate the gradients of adversarial images, and a Fusion Block is designed in TRN to explore and fuse the information in two streams. Once trained, our TRN can defend against a wide range of attack methods without significantly degrading performance on benign inputs. Also, our method is generalizable, scalable, and hard to bypass. Experimental results on CIFAR10, SVHN, and Fashion MNIST demonstrate that our method outperforms state-of-the-art defense methods.
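
To make the abstract's notion of a gradient map concrete, here is a minimal sketch in plain Python. It is not the paper's TRN or its Gradient Map Estimation Mechanism. It assumes a toy logistic-regression "model" over a hypothetical 4-pixel image, uses the true label for simplicity, and applies a one-step sign-gradient (FGSM-style) perturbation. All names, weights, and values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def input_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to each pixel of x.

    For logistic regression this has the closed form (p - y) * w.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One-step sign-gradient attack, with pixels clipped to [0, 1]."""
    g = input_gradient(w, b, x, y)
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, g)]

# Hypothetical weights and 4-pixel "image", chosen only for illustration.
w = [2.0, -1.0, 0.5, -3.0]
b = 0.0
x = [0.6, 0.2, 0.5, 0.1]
y = 1  # true label (assumed known in this sketch)

x_adv = fgsm(w, b, x, y, eps=0.3)

# Gradient map of the adversarial image: one value per pixel, showing how
# strongly each pixel influences the loss at the perturbed input.
grad_map = input_gradient(w, b, x_adv, y)

print("clean prediction:      ", predict(w, b, x))
print("adversarial prediction:", predict(w, b, x_adv))
print("gradient map:          ", grad_map)
```

In this toy setting the same per-pixel gradient that drives the attack is also available as a second input alongside the perturbed image, which is the kind of information the abstract says the two streams of TRN consume.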

