LG CR ML · Feb 10, 2020

Random Smoothing Might be Unable to Certify $\ell_\infty$ Robustness for High-Dimensional Images

Avrim Blum, Travis Dick, Naren Manoj, Hongyang Zhang

arXiv:2002.03517v322.184 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a key open problem in adversarial machine learning by proving a hardness result that limits the applicability of random smoothing for ℓ_∞ robustness in high-dimensional settings.

The paper studies certified adversarial robustness for high-dimensional images via random smoothing under ℓ_p norms with p>2. It shows that any noise distribution providing such robustness for all base classifiers must have per-pixel variance that grows with the dimension d (as d^{1-2/p}) on 99% of the pixels, so for images with bounded pixel values the noise overwhelms the signal and the smoothed classifier becomes trivial.

We show a hardness result for random smoothing to achieve certified adversarial robustness against attacks in the $\ell_p$ ball of radius $\epsilon$ when $p>2$. Although random smoothing has been well understood for the $\ell_2$ case using the Gaussian distribution, much remains unknown concerning the existence of a noise distribution that works for the case of $p>2$. This has been posed as an open problem by Cohen et al. (2019) and includes many significant paradigms such as the $\ell_\infty$ threat model. In this work, we show that any noise distribution $\mathcal{D}$ over $\mathbb{R}^d$ that provides $\ell_p$ robustness for all base classifiers with $p>2$ must satisfy $\mathbb{E}\eta_i^2=\Omega(d^{1-2/p}\epsilon^2(1-\delta)/\delta^2)$ for 99% of the features (pixels) of vector $\eta\sim\mathcal{D}$, where $\epsilon$ is the robust radius and $\delta$ is the score gap between the highest-scored class and the runner-up. Therefore, for high-dimensional images with pixel values bounded in $[0,255]$, the required noise will eventually dominate the useful information in the images, leading to trivial smoothed classifiers.
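To get a feel for how the lower bound in the abstract scales, the following minimal Python sketch evaluates the square root of $d^{1-2/p}\epsilon^2(1-\delta)/\delta^2$ for a few image sizes. The function name `noise_std_lower_bound`, the constant `c = 1.0` standing in for the unspecified constant hidden by $\Omega(\cdot)$, and the example values of $\epsilon$ and $\delta$ are all assumptions made for illustration; none of them come from the paper.

```python
import math


def noise_std_lower_bound(d, p, eps, delta, c=1.0):
    """Per-pixel noise std implied by the bound, up to the unknown constant c.

    Evaluates sqrt(c * d^(1 - 2/p) * eps^2 * (1 - delta) / delta^2).
    The constant hidden by Omega(.) is not given in the abstract, so
    c = 1.0 is an illustrative assumption, not a value from the paper.
    """
    if p <= 2:
        raise ValueError("the bound is stated only for p > 2")
    if not 0.0 < delta <= 1.0:
        raise ValueError("delta (score gap) must lie in (0, 1]")
    # d^(1 - 2/p) tends to d as p goes to infinity (the l_inf case).
    exponent = 1.0 if math.isinf(p) else 1.0 - 2.0 / p
    variance = c * (d ** exponent) * eps ** 2 * (1.0 - delta) / delta ** 2
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical settings: eps = 8 on the [0, 255] pixel scale, delta = 0.5.
    for name, d in [("32x32x3", 32 * 32 * 3), ("224x224x3", 224 * 224 * 3)]:
        std = noise_std_lower_bound(d, math.inf, eps=8.0, delta=0.5)
        print(f"{name}: d={d}, std >= {std:.1f} (pixel range is 255)")
```

Under these assumed values the bound already exceeds the 255 pixel range at both resolutions, which matches the qualitative point of the abstract. The exact numbers depend on the unspecified constant and should not be read as results from the paper.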

