CV, CR, LG, ML | Dec 3, 2018

Disentangling Adversarial Robustness and Generalization

arXiv:1812.00740v2 | 322 citations
AI Analysis

This work addresses a foundational issue in machine learning by clarifying the relationship between adversarial robustness and generalization, potentially enabling more reliable models, though it is incremental in building on existing hypotheses.

The paper asks whether adversarial robustness and generalization are conflicting goals in deep networks. Assuming a low-dimensional data manifold, it shows through experiments on synthetic and real datasets that models can be both robust and accurate, and it identifies on-manifold adversarial examples as generalization errors.

Obtaining deep networks that are robust against adversarial examples and generalize well is an open problem. A recent hypothesis even states that both robust and accurate models are impossible, i.e., adversarial robustness and generalization are conflicting goals. In an effort to clarify the relationship between robustness and generalization, we assume an underlying, low-dimensional data manifold and show that: 1. regular adversarial examples leave the manifold; 2. adversarial examples constrained to the manifold, i.e., on-manifold adversarial examples, exist; 3. on-manifold adversarial examples are generalization errors, and on-manifold adversarial training boosts generalization; 4. regular robustness and generalization are not necessarily contradicting goals. These assumptions imply that both robust and accurate models are possible. However, different models (architectures, training strategies etc.) can exhibit different robustness and generalization characteristics. To confirm our claims, we present extensive experiments on synthetic data (with known manifold) as well as on EMNIST, Fashion-MNIST and CelebA.
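The distinction between regular and on-manifold adversarial examples can be illustrated with a minimal sketch. This is a toy setup, not the paper's method: the manifold (a unit circle), the linear classifier, and the names `decode`, `score`, `manifold_distance`, `regular_attack` and `on_manifold_attack` are all assumptions made for illustration. A regular attack perturbs the input directly and will generally leave the manifold, while an on-manifold attack perturbs a latent code and decodes it, so the result stays on the manifold by construction.

```python
import math


def _sign(v):
    """Return -1, 0 or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def decode(z):
    """Hypothetical known manifold: the unit circle, parameterised by angle z."""
    return (math.cos(z), math.sin(z))


def score(x, w=(1.0, 0.0), b=0.0):
    """Toy linear classifier; the predicted label is the sign of the score."""
    return w[0] * x[0] + w[1] * x[1] + b


def manifold_distance(x):
    """Euclidean distance from x to the unit circle."""
    return abs(math.hypot(x[0], x[1]) - 1.0)


def regular_attack(x, y, eps, w=(1.0, 0.0)):
    """One signed-gradient step in input space (L-infinity budget eps).

    The gradient of the margin y * score(x) with respect to x is y * w,
    so stepping against its sign lowers the margin.
    """
    return tuple(xi - eps * _sign(y * wi) for xi, wi in zip(x, w))


def on_manifold_attack(z, y, eta, w=(1.0, 0.0)):
    """One signed-gradient step in latent space (budget eta), then decode.

    Chain rule: d/dz score(decode(z)) = w . (-sin z, cos z).
    """
    grad_z = y * (w[0] * -math.sin(z) + w[1] * math.cos(z))
    return decode(z - eta * _sign(grad_z))
```

For a point with latent code `z = 1.2` and label `y = 1`, calling `regular_attack(decode(z), y, 0.5)` and `on_manifold_attack(z, y, 0.5)` both produce misclassified points under this toy classifier, but only the second has `manifold_distance` equal to zero up to floating-point error. In the paper's experiments on real data the manifold is not known in closed form and has to be approximated by a learned model, which this sketch does not attempt.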

Code Implementations: 2 repos