LG, AI, CR | Feb 1, 2018

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

arXiv:1802.00420v4 | 3521 citations
Originality: Highly original
AI Analysis

This work exposes a critical flaw in many adversarial defenses: they rely on obfuscated gradients, which hinder gradient-based attacks without making the underlying model robust.

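The paper addresses obfuscated gradients in adversarial defenses. Such defenses appear to defeat iterative optimization-based attacks and so create a false sense of security, but they can be circumvented with attack techniques tailored to each of the three types of obfuscated gradients. In a case study of ICLR 2018 defenses, 7 of 9 relied on obfuscated gradients; the new attacks circumvented 6 of those completely and 1 partially.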

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.
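To make the phenomenon concrete, here is a minimal pure-Python sketch, not the authors' code. It assumes a toy linear classifier behind a hypothetical input-quantization defense, one simple way gradients can become uninformative. The first attack uses the true (finite-difference) gradient of the defended model, which is zero almost everywhere, so it stalls. The second keeps the defense in the forward pass but treats it as the identity in the backward pass, which is the idea the full paper calls Backward Pass Differentiable Approximation (BPDA). All names, weights, and parameters below are illustrative assumptions.

```python
def quantize(x, levels=4):
    """Hypothetical defense: snap every input coordinate to a coarse grid."""
    return [round(v * levels) / levels for v in x]


def score(z, w, b):
    """Toy linear classifier; a positive score means class +1."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


def defended_score(x, w, b):
    """The defended model: quantize first, then classify."""
    return score(quantize(x), w, b)


def finite_difference_gradient(x, w, b, h=1e-4):
    """True gradient of the defended model (a stand-in for backprop).
    Quantization is piecewise constant, so this is zero almost everywhere."""
    base = defended_score(x, w, b)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h
        grad.append((defended_score(shifted, w, b) - base) / h)
    return grad


def bpda_gradient(x, w, b):
    """BPDA-style gradient: the backward pass treats quantize as identity.
    For a linear score, the gradient with respect to its input is just w."""
    return list(w)


def sign(v):
    return (v > 0) - (v < 0)


def attack(x, w, b, grad_fn, eps=0.3, step=0.05, iters=20):
    """Iterative signed-gradient attack that lowers the score while
    staying inside an L-infinity ball of radius eps around x."""
    x_adv = list(x)
    for _ in range(iters):
        g = grad_fn(x_adv, w, b)
        x_adv = [xa - step * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv


# Illustrative weights and input (assumptions, not from the paper).
w, b = [1.0, -2.0, 0.5], 0.1
x = [0.6, 0.1, 0.4]

naive_adv = attack(x, w, b, finite_difference_gradient)
bpda_adv = attack(x, w, b, bpda_gradient)

print("clean score:       ", defended_score(x, w, b))
print("naive attack score:", defended_score(naive_adv, w, b))
print("BPDA attack score: ", defended_score(bpda_adv, w, b))
```

In this toy setting the naive attack never moves the input because it receives no gradient signal, while the BPDA-style attack flips the sign of the score within the same perturbation budget. The defense only hid the gradient; it did not remove the adversarial example.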

Code Implementations: 4 repos