LG AI ML · Feb 19, 2021

A PAC-Bayes Analysis of Adversarial Robustness

arXiv:2102.11069v2 · 20 citations
Originality: Incremental advance
AI Analysis

This work addresses how to estimate and control a model's invariance to imperceptible input perturbations, a property that matters for security-critical applications. The advance appears incremental because it builds on existing PAC-Bayesian frameworks.

The paper tackles the problem of estimating adversarial robustness by proposing what the authors describe as the first general PAC-Bayesian generalization bounds for this setting. According to the abstract, the bounds are tight, hold for any kind of adversarial attack, and can be minimized during training to obtain a model that is robust at test time.

We propose the first general PAC-Bayesian generalization bounds for adversarial robustness that estimate, at test time, how invariant a model will be to imperceptible perturbations in the input. Instead of deriving a worst-case analysis of the risk of a hypothesis over all possible perturbations, we leverage the PAC-Bayesian framework to bound the averaged risk on the perturbations for majority votes (over the whole class of hypotheses). Our theoretically founded analysis has the advantage of providing general bounds (i) that are valid for any kind of attack (i.e., any adversarial attack), (ii) that are tight thanks to the PAC-Bayesian framework, and (iii) that can be directly minimized during the learning phase to obtain a model that is robust to different attacks at test time.
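The contrast between an averaged risk over perturbations and a worst-case risk can be made concrete with a small numerical sketch. Everything in the block below is an illustrative assumption rather than the paper's method: the linear voters, the uniform weights `rho`, the uniform sampling of perturbations in an L-infinity ball of radius `eps`, and the helper names `majority_vote` and `perturbed_risks` are all hypothetical. The code estimates empirical error rates only and does not compute any PAC-Bayesian bound.

```python
import numpy as np


def majority_vote(voters, rho, X):
    """Predict with a rho-weighted majority vote of linear voters.

    voters: (k, d) weight vectors (hypothetical hypothesis class).
    rho:    (k,) non-negative weights summing to 1.
    X:      (..., d) inputs. Returns predictions in {-1, +1}.
    """
    votes = np.sign(X @ voters.T)            # (..., k): each voter's label
    return np.where(votes @ rho >= 0, 1, -1)


def perturbed_risks(voters, rho, X, y, eps, n_perturb, rng):
    """Return (averaged risk, sampled worst-case risk) of the majority vote."""
    # Assumption: perturbations are drawn uniformly from the L-inf ball.
    noise = rng.uniform(-eps, eps, size=(X.shape[0], n_perturb, X.shape[1]))
    preds = majority_vote(voters, rho, X[:, None, :] + noise)  # (n, n_perturb)
    errors = preds != y[:, None]
    # Averaged risk: mean 0-1 loss over examples and sampled perturbations.
    averaged = errors.mean()
    # Sampled worst case: an example counts as an error if any sampled
    # perturbation fools the vote.
    worst_sampled = errors.any(axis=1).mean()
    return averaged, worst_sampled


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = np.ones(5)
y = np.where(X @ w_true >= 0, 1, -1)
voters = w_true + 0.3 * rng.normal(size=(7, 5))   # 7 noisy linear voters
rho = np.full(7, 1 / 7)                           # uniform weights over voters

avg, worst = perturbed_risks(voters, rho, X, y, eps=0.2, n_perturb=50, rng=rng)
print(f"averaged risk: {avg:.3f}, sampled worst-case risk: {worst:.3f}")
```

By construction the averaged estimate never exceeds the sampled worst-case estimate, since an example's mean error over its perturbations is at most 1 whenever any perturbation fools the vote. Because the perturbations are sampled rather than optimized, the second quantity is itself only a lower estimate of the true worst-case risk.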

Code Implementations: 1 repo
