A Law of Robustness beyond Isoperimetry
This work addresses theoretical gaps in understanding robust interpolation beyond isoperimetric distributions, with implications for overparametrization and limits in machine learning robustness.
The paper tackles the robust interpolation problem for arbitrary data distributions, proving a Lipschitzness lower bound of Ω(√(n/p)) for neural networks and Ω(n^(1/d)) for arbitrary approximators, validating and extending prior conjectures on robustness laws.
We study the robust interpolation problem of arbitrary data distributions supported on a bounded space and propose a two-fold law of robustness. Robust interpolation refers to the problem of interpolating $n$ noisy training data points in $\mathbb{R}^d$ by a Lipschitz function. Although this problem has been well understood when the samples are drawn from an isoperimetry distribution, much remains unknown concerning its performance under generic or even the worst-case distributions. We prove a Lipschitzness lower bound $Ω(\sqrt{n/p})$ of the interpolating neural network with $p$ parameters on arbitrary data distributions. With this result, we validate the law of robustness conjecture in prior work by Bubeck, Li, and Nagaraj on two-layer neural networks with polynomial weights. We then extend our result to arbitrary interpolating approximators and prove a Lipschitzness lower bound $Ω(n^{1/d})$ for robust interpolation. Our results demonstrate a two-fold law of robustness: i) we show the potential benefit of overparametrization for smooth data interpolation when $n=\mathrm{poly}(d)$, and ii) we disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(ω(d))$.