LG MLDec 19, 2020

Sample Complexity of Adversarially Robust Linear Classification on Separated Data

Robi Bhattacharjee, Somesh Jha, Kamalika Chaudhuri

arXiv:2012.10794v35.86 citations

Originality Highly original

AI Analysis

This work addresses the problem of understanding sample efficiency in adversarial robustness for machine learning practitioners, focusing on a previously underexplored well-separated data setting, which is incremental as it builds on prior theoretical results but reveals new gaps and conditions.

The paper tackles the sample complexity of adversarially robust linear classification for well-separated data, showing that the expected robust loss can be at least Ω(d/n) for many distributions, while the max margin algorithm achieves O(1/n) standard loss, revealing a gap not captured by prior work. It also presents an algorithm that achieves O(1/n) robust loss when the robustness radius is much smaller than the class gap, indicating faster convergence is possible in very well-separated cases.

We consider the sample complexity of learning with adversarial robustness. Most prior theoretical results for this problem have considered a setting where different classes in the data are close together or overlapping. Motivated by some real applications, we consider, in contrast, the well-separated case where there exists a classifier with perfect accuracy and robustness, and show that the sample complexity narrates an entirely different story. Specifically, for linear classifiers, we show a large class of well-separated distributions where the expected robust loss of any algorithm is at least $Ω(\frac{d}{n})$, whereas the max margin algorithm has expected standard loss $O(\frac{1}{n})$. This shows a gap in the standard and robust losses that cannot be obtained via prior techniques. Additionally, we present an algorithm that, given an instance where the robustness radius is much smaller than the gap between the classes, gives a solution with expected robust loss is $O(\frac{1}{n})$. This shows that for very well-separated data, convergence rates of $O(\frac{1}{n})$ are achievable, which is not the case otherwise. Our results apply to robustness measured in any $\ell_p$ norm with $p > 1$ (including $p = \infty$).

View on arXiv PDF

Similar