LG · ML · Nov 1, 2018

On the Geometry of Adversarial Examples

arXiv:1811.00525v2 · 86 citations
AI Analysis

This work addresses the vulnerability of machine learning models to adversarial attacks and provides theoretical insights that could improve robustness. Its contribution is incremental in that it builds on existing geometric concepts.

The paper proposes a geometric framework for analyzing the high-dimensional geometry of adversarial examples and uses it to prove three results: a tradeoff between robustness under different norms, the sample inefficiency of adversarial training in balls around the data, and sufficient sampling conditions under which nearest neighbor classifiers and ball-based adversarial training are robust.

Adversarial examples are a pervasive phenomenon of machine learning models where seemingly imperceptible perturbations to the input lead to misclassifications for otherwise statistically accurate models. We propose a geometric framework, drawing on tools from the manifold reconstruction literature, to analyze the high-dimensional geometry of adversarial examples. In particular, we highlight the importance of codimension: for low-dimensional data manifolds embedded in high-dimensional space there are many directions off the manifold in which to construct adversarial examples. Adversarial examples are a natural consequence of learning a decision boundary that classifies the low-dimensional data manifold well, but classifies points near the manifold incorrectly. Using our geometric framework we prove (1) a tradeoff between robustness under different norms, (2) that adversarial training in balls around the data is sample inefficient, and (3) sufficient sampling conditions under which nearest neighbor classifiers and ball-based adversarial training are robust.
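The codimension argument in the abstract can be illustrated with a small toy example. The sketch below is not the paper's construction: it assumes a hypothetical setup in which the data manifold is a single coordinate axis in a 50-dimensional space (codimension 49) and the classifier is a hand-picked linear rule `w` that is perfect on the manifold but tilted in the off-manifold directions. The names `d`, `w`, and `deltas` are chosen here for illustration only.

```python
import numpy as np

d = 50                                   # ambient dimension (assumed for illustration)
rng = np.random.default_rng(0)

# Data manifold: the first coordinate axis, so the codimension is d - 1.
# Class -1 lives at t in [-2, -1], class +1 at t in [1, 2].
t = np.concatenate([rng.uniform(-2, -1, 100), rng.uniform(1, 2, 100)])
X = np.zeros((200, d))
X[:, 0] = t
y = np.sign(t)

# Hypothetical linear classifier sign(w . x): on the manifold w . x = t,
# so it separates the classes perfectly, but it is steep off the manifold.
w = np.full(d, 5.0)
w[0] = 1.0

on_manifold_accuracy = np.mean(np.sign(X @ w) == y)

# Smallest L2 perturbation that moves each point onto the boundary w . x = 0.
deltas = -(X @ w)[:, None] / (w @ w) * w[None, :]
adv_norms = np.linalg.norm(deltas, axis=1)

# Distance to the boundary when restricted to moving along the manifold.
on_manifold_dist = np.abs(t)

# Fraction of each perturbation's length that points off the manifold.
off_fraction = np.linalg.norm(deltas[:, 1:], axis=1) / adv_norms

# Overshooting the boundary by 1% flips the predicted label.
flipped = np.sign((X + 1.01 * deltas) @ w) == -y

print("accuracy on manifold:", on_manifold_accuracy)
print("mean on-manifold distance / adversarial distance:",
      np.mean(on_manifold_dist / adv_norms))
print("mean off-manifold fraction of perturbation:", off_fraction.mean())
print("all labels flipped:", flipped.all())
```

In this toy setting the classifier is fully accurate on the data, yet every point can be pushed across the decision boundary by a perturbation far shorter than the distance it would have to travel along the manifold, and that perturbation points almost entirely off the manifold. Increasing `d` adds more off-manifold directions and, for this particular `w`, shrinks the adversarial distance further.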

