LG | Sep 25, 2023

Projected Randomized Smoothing for Certified Adversarial Robustness

arXiv:2309.13794v1 | 20 citations | h-index: 30
Originality: Highly original
AI Analysis

This work addresses the vulnerability of classifiers to adversarial perturbations normal to the data manifold, offering a significant improvement in certified robustness for machine learning security applications.

The paper improves certified adversarial robustness by projecting inputs onto a low-dimensional approximation of the data manifold before applying randomized smoothing. On CIFAR-10 and SVHN, the resulting certified regions have volumes many orders of magnitude larger than those of state-of-the-art baselines.

Randomized smoothing is the current state-of-the-art method for producing provably robust classifiers. While randomized smoothing typically yields robust $\ell_2$-ball certificates, recent research has generalized provable robustness to different norm balls as well as anisotropic regions. This work considers a classifier architecture that first projects onto a low-dimensional approximation of the data manifold and then applies a standard classifier. By performing randomized smoothing in the low-dimensional projected space, we characterize the certified region of our smoothed composite classifier back in the high-dimensional input space and prove a tractable lower bound on its volume. We show experimentally on CIFAR-10 and SVHN that classifiers without the initial projection are vulnerable to perturbations that are normal to the data manifold and yet are captured by the certified regions of our method. We compare the volume of our certified regions against various baselines and show that our method improves on the state-of-the-art by many orders of magnitude.
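The pipeline described in the abstract (project, then smooth in the projected space) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the projection is taken to be PCA (the abstract says only "a low-dimensional approximation of the data manifold"), the certificate is the standard Gaussian-smoothing $\ell_2$ radius $\sigma\,\Phi^{-1}(\underline{p_A})$, and the lower bound $\underline{p_A}$ comes from a simple Hoeffding inequality rather than a tighter Clopper-Pearson interval. All function names are hypothetical. Under these assumptions, a perturbation $\delta$ of the input moves the projected point by $U^\top \delta$, so the prediction is certified whenever $\|U^\top \delta\|_2 < R$; components of $\delta$ normal to the PCA subspace are not constrained at all.

```python
import numpy as np
from statistics import NormalDist


def fit_pca_projection(X, k):
    """Return (mean, U); U has k orthonormal columns spanning the top-k PCA subspace."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T  # U has shape (d, k)


def project(x, mean, U):
    """Map an input x in R^d to its coordinates z in R^k."""
    return (x - mean) @ U


def certify_projected(base_classifier, z, sigma, num_classes, rng,
                      n_select=100, n_estimate=1000, alpha=0.001):
    """Randomized smoothing in the projected space (illustrative sketch).

    base_classifier maps an (n, k) array to n non-negative integer labels.
    Returns (label, radius), or (None, 0.0) to abstain. The radius is an
    l2 radius in the projected space.
    """
    k = z.shape[0]
    # Step 1: guess the top class from a small, separate batch of noisy samples.
    guesses = base_classifier(z + rng.normal(scale=sigma, size=(n_select, k)))
    top = int(np.bincount(guesses, minlength=num_classes).argmax())
    # Step 2: estimate the probability of that class from fresh samples.
    preds = base_classifier(z + rng.normal(scale=sigma, size=(n_estimate, k)))
    p_hat = float(np.mean(preds == top))
    # One-sided Hoeffding lower bound, valid with probability >= 1 - alpha
    # (an assumption of this sketch; looser than Clopper-Pearson).
    p_lower = p_hat - np.sqrt(np.log(1.0 / alpha) / (2.0 * n_estimate))
    if p_lower <= 0.5:
        return None, 0.0
    return top, sigma * NormalDist().inv_cdf(float(p_lower))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data lying close to a 2-dimensional subspace of R^10.
    basis = np.linalg.qr(rng.normal(size=(10, 10)))[0]
    latent = rng.normal(size=(500, 2)) * np.array([5.0, 2.0])
    X = latent @ basis[:, :2].T + 0.01 * rng.normal(size=(500, 10))
    mean, U = fit_pca_projection(X, k=2)

    # Hypothetical base classifier: thresholds the first projected coordinate.
    def classify(Z):
        return (Z[:, 0] > 0).astype(int)

    x = mean + 4.0 * U[:, 0]
    label, radius = certify_projected(classify, project(x, mean, U),
                                      sigma=1.0, num_classes=2, rng=rng)
    print("label:", label, "certified radius in projected space:", radius)
```

In this sketch the certified set in input space is $\{x + \delta : \|U^\top \delta\|_2 < R\}$, which is unbounded along directions normal to the subspace; a meaningful volume therefore requires intersecting it with the valid input domain, which is presumably why the abstract speaks of a lower bound on the volume rather than an exact value.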

Code Implementations: 1 repo
