LG · CR · CV · May 20, 2017

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

arXiv:1705.07263v2 · 1983 citations
AI Analysis

This work exposes a critical weakness in current defenses against adversarial attacks, with direct implications for the security of AI systems. Its contribution is incremental in the sense that it tests existing detection methods rather than proposing a new one.

The paper tackles the problem of detecting adversarial examples in neural networks by evaluating ten recent detection methods. It shows that all ten can be bypassed by constructing new loss functions, which indicates that adversarial examples are harder to detect than previously thought.

Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.
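To make "constructing new loss functions" concrete, here is a minimal sketch of the general idea: an attack whose objective penalizes both the classifier's original label and the detector's alarm. It is an illustration under stated assumptions, not the paper's own formulation. The linear classifier and detector, their weights, and the hinge-style loss with parameters `c`, `margin`, `lr`, and `steps` are all hypothetical choices made for this toy example.

```python
import numpy as np

# Hypothetical toy models (assumptions, not taken from the paper):
# a linear classifier and a linear detector on 2-D inputs.
W_F, B_F = np.array([1.0, 0.0]), 0.0    # classifier logit > 0 -> original class
W_D, B_D = np.array([-1.0, 1.0]), -0.5  # detector logit > 0 -> flagged as adversarial


def classifier_logit(x):
    return float(W_F @ x + B_F)


def detector_logit(x):
    return float(W_D @ x + B_D)


def attack(x, detector_weight, c=10.0, margin=0.1, lr=0.001, steps=3000):
    """Subgradient descent on an illustrative combined loss:

        ||x_adv - x||^2
        + c * max(classifier_logit + margin, 0)
        + c * detector_weight * max(detector_logit + margin, 0)

    detector_weight = 0 gives an attack that ignores the detector.
    """
    x_adv = x.copy()
    for _ in range(steps):
        grad = 2.0 * (x_adv - x)  # gradient of the distortion term
        if classifier_logit(x_adv) + margin > 0:
            grad = grad + c * W_F  # push away from the original class
        if detector_weight > 0 and detector_logit(x_adv) + margin > 0:
            grad = grad + c * detector_weight * W_D  # push below the detector threshold
        x_adv = x_adv - lr * grad
    return x_adv


x = np.array([1.0, 1.0])                   # clean input: original class, not flagged
naive = attack(x, detector_weight=0.0)     # loss ignores the detector
adaptive = attack(x, detector_weight=1.0)  # loss includes the detector term

for name, point in [("clean", x), ("naive", naive), ("adaptive", adaptive)]:
    print(name, "classifier:", classifier_logit(point), "detector:", detector_logit(point))
```

In this toy setup, the attack that ignores the detector flips the classifier but is flagged, while the attack that includes the detector term flips the classifier and stays below the detector threshold. Real detectors are nonlinear and often non-differentiable, so a practical attack would need a loss tailored to each defense.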

