Scale-Sensitive Shattering: Learnability and Evaluability at Optimal Scale

Shashaank Aiyer, Yishay Mansour, Shay Moran, Han Shao, Tom Waknine

arXiv:2605.1368444.6

AI Analysis

For learning theory researchers, this resolves long-standing open questions about the precise scales governing learnability and evaluability, providing tight characterizations and refuting a conjecture.

The paper establishes a scale-sensitive generalization of the fundamental theorem of PAC learning, showing equivalence between uniform convergence, agnostic learnability, and finiteness of fat-shattering dimension at optimal scales, resolving open questions and improving previous bounds. Key results include sharp entropy bounds and a dichotomy for integral probability metrics.

We study the optimal scale at which real-valued function classes exhibit uniform convergence and learnability. Our main result establishes a scale-sensitive generalization of the fundamental theorem of PAC learning: for every bounded real-valued class and every $γ>0$, uniform convergence at scale $γ$, agnostic learnability at scale $γ/2$, and finiteness of the fat-shattering dimension at every scale $γ'>γ$ are equivalent. This resolves a question by Anthony and Bartlett (Cambridge Univ. Press 1999) on the precise scales governing learnability, refuting a conjecture attributed there to Phil Long that a multiplicative 2-factor gap is unavoidable, and improves the upper bounds of Bartlett and Long (JCSS 1998), which incur such a loss. The key technical ingredient is a direct bound on empirical $\ell_\infty$ covering numbers, avoiding the standard detour through packing numbers. As a consequence, we obtain sharp asymptotic metric-entropy bounds in terms of the fat-shattering scale $γ$: an $O(\log^2 n)$ bound holds already at scale $γ/2$, while an $O(\log n)$ bound holds at scale $2γ$. We further show that the $O(\log^2 n)$ bound is sometimes tight. These results resolve open questions by Alon et al. (JACM 1997) and Rudelson and Vershynin (Ann. of Math. 2006). As an application, we establish a sharp dichotomy for bounded integral probability metrics: every such IPM is either estimable or cannot be weakly evaluated within any multiplicative factor $c<3$, while $3$-weak evaluability always holds, resolving an open question from Aiyer et al. (ICML 2026). We also highlight several open questions on quantitative sample complexity and evaluability.

View on arXiv PDF

Similar