ML · LG · May 27, 2018

Defending Against Adversarial Attacks by Leveraging an Entire GAN

arXiv:1805.10652v1 · 40 citations
Originality: Incremental advance
AI Analysis

This paper addresses the vulnerability of state-of-the-art models to adversarial perturbations. It offers a defense mechanism that is independent of the classifier and attack type, though it appears incremental because it builds on existing GAN-based ideas.

The paper tackles the problem of defending machine learning models against adversarial attacks. It proposes an approach that uses both the discriminator and the generator of a GAN to detect and clean adversarial samples. It presents empirical evidence that adversarial samples lie outside the learned data manifold, and it provides a cleaning method that works across multiple attacks and datasets.

Recent work has shown that state-of-the-art models are highly vulnerable to adversarial perturbations of the input. We propose cowboy, an approach to detecting and defending against adversarial attacks by using both the discriminator and generator of a GAN trained on the same dataset. We show that the discriminator consistently scores the adversarial samples lower than the real samples across multiple attacks and datasets. We provide empirical evidence that adversarial samples lie outside of the data manifold learned by the GAN. Based on this, we propose a cleaning method which uses both the discriminator and generator of the GAN to project the samples back onto the data manifold. This cleaning procedure is independent of the classifier and type of attack and thus can be deployed in existing systems.
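The detect-then-clean idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the generator is a fixed linear map `W`, the discriminator is a hand-set logistic scorer (`w_d`, `c_d`), and the cleaning objective `||G(z) - x||^2 - lam * log D(G(z))` is one plausible way to combine the two networks; the paper's exact objective, optimizer, and hyperparameters may differ. A real setup would use a GAN trained on the same dataset as the classifier.

```python
import numpy as np

# Hypothetical toy "GAN": a linear generator whose outputs span a 2-D
# manifold inside a 5-D input space (last two coordinates are always 0).
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0],
              [0.0, 0.0]])

# Hypothetical logistic discriminator: penalizes off-manifold coordinates.
w_d = np.array([0.1, 0.1, 0.0, -4.0, -4.0])
c_d = 2.0


def generator(z):
    """Map a latent vector z to input space."""
    return W @ z


def discriminator(x):
    """Return a score in (0, 1); higher means 'looks like real data'."""
    return 1.0 / (1.0 + np.exp(-(w_d @ x + c_d)))


def clean(x, lam=0.1, lr=0.05, steps=500):
    """Project x onto the generator's range by gradient descent on z.

    Minimizes ||G(z) - x||^2 - lam * log D(G(z)) (an assumed objective).
    """
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        g = generator(z)
        d = discriminator(g)
        # Analytic gradient for the linear G and logistic D defined above.
        grad = 2.0 * W.T @ (g - x) - lam * (1.0 - d) * (W.T @ w_d)
        z -= lr * grad
    return generator(z)


# A clean sample on the manifold and an off-manifold perturbed copy.
x_clean = generator(np.array([0.5, -0.3]))
x_adv = x_clean + np.array([0.0, 0.0, 0.0, 0.5, 0.5])

# Detection: flag inputs whose discriminator score falls below a threshold.
threshold = 0.5  # assumed value; would be tuned on held-out data
is_flagged = discriminator(x_adv) < threshold

# Cleaning: replace the flagged input with its projection G(z*).
x_cleaned = clean(x_adv)
```

Because the cleaning step only touches the input, the downstream classifier needs no retraining, which matches the abstract's claim that the procedure is independent of the classifier and the type of attack.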

Code Implementations: 1 repo
