LG CR CV ML | Feb 23, 2018

Adversarial vulnerability for any classifier

arXiv:1802.08686v2 | 265 citations
AI Analysis

This paper addresses adversarial attacks, a critical issue for machine learning practitioners, and offers theoretical insight into the inherent limits of classifier robustness.

The paper tackles the problem of adversarial vulnerability in classifiers by deriving fundamental upper bounds on robustness under a smooth generative model assumption, proving the existence of transferable adversarial perturbations with small risk, and showing that these bounds provide informative baselines on several datasets.

Despite achieving impressive performance, state-of-the-art classifiers remain highly vulnerable to small, imperceptible, adversarial perturbations. This vulnerability has proven empirically to be very difficult to address. In this paper, we study the phenomenon of adversarial perturbations under the assumption that the data is generated with a smooth generative model. We derive fundamental upper bounds on the robustness to perturbations of any classification function, and prove the existence of adversarial perturbations that transfer well across different classifiers with small risk. Our analysis of the robustness also provides insights into key properties of generative models, such as their smoothness and the dimensionality of their latent space. We conclude with numerical experimental results showing that our bounds provide informative baselines to the maximal achievable robustness on several datasets.
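The role of smoothness and latent dimensionality can be illustrated with a small simulation. The sketch below is not the paper's bound or experiment; it is a toy setup under stated assumptions: latent codes are drawn from a standard Gaussian, the classifier is a hypothetical half-space in latent space, and the generator is only assumed to be L-Lipschitz, so a latent move of size r changes the generated sample by at most L times r. The function name `latent_margin_stats` and its parameters are invented for illustration.

```python
import math
import random


def latent_margin_stats(d, n_samples=1000, lipschitz=1.0, seed=0):
    """Monte Carlo toy illustration (hypothetical setup, not the paper's method).

    Assumptions:
      - latent codes z ~ N(0, I_d)
      - the classifier, viewed in latent space, is sign(z[0])
      - the generator g is L-Lipschitz, so |g(z + r) - g(z)| <= L * |r|
    """
    rng = random.Random(seed)
    margins = []
    norms = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Distance from z to the decision boundary {z[0] = 0}.
        margins.append(abs(z[0]))
        norms.append(math.sqrt(sum(v * v for v in z)))
    margins.sort()
    median_margin = margins[n_samples // 2]
    mean_norm = sum(norms) / n_samples
    return {
        "median_latent_margin": median_margin,
        # Lipschitz assumption: an image-space perturbation of at most
        # this size suffices to flip the label for half of the samples.
        "image_space_bound": lipschitz * median_margin,
        "mean_latent_norm": mean_norm,
        "relative_margin": median_margin / mean_norm,
    }


if __name__ == "__main__":
    for d in (10, 100, 1000):
        stats = latent_margin_stats(d)
        print(d, {k: round(v, 3) for k, v in stats.items()})
```

In this toy setting the median latent distance to the boundary stays close to the median of |N(0, 1)|, about 0.67, whatever the dimension, while the typical latent norm grows roughly like the square root of d. The perturbation needed to change the label therefore shrinks relative to the size of the data as the latent dimension grows, and a smoother generator (smaller L) tightens the image-space figure further.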

