How many samples are needed to leverage smoothness?
This work addresses a core challenge in statistical learning that matters to practitioners: sample efficiency in high-dimensional settings. Its contribution is incremental, refining existing theoretical understanding.
The paper formalizes the intuition that learning smooth functions requires many samples to estimate high-order derivatives. It does so by deriving new lower bounds on the generalization error, and then investigates the role of constants and transitory regimes in practical scenarios.
A core principle in statistical learning is that smoothness of the target function allows one to break the curse of dimensionality. However, learning a smooth function seems to require enough samples close to one another to get meaningful estimates of high-order derivatives, which would be hard in machine learning problems where the ratio between the number of data points and the input dimension is relatively small. By deriving new lower bounds on the generalization error, this paper formalizes this intuition, before investigating the role of constants and transitory regimes, which are usually not captured by classical learning theory statements even though they play a dominant role in practice.
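As a rough illustration of the intuition above, and not the paper's actual lower bound, one can count how many coefficients a local Taylor expansion has. A polynomial of total degree at most k in d variables has C(d + k, k) monomials, so fitting one locally needs at least that many nearby samples. The sketch below tabulates this count; the function name `num_taylor_coefficients` and the chosen values of d and k are assumptions made purely for illustration.

```python
from math import comb


def num_taylor_coefficients(d: int, k: int) -> int:
    """Number of monomials of total degree <= k in d variables.

    This is a hypothetical helper for illustration: it counts the free
    coefficients of an order-k Taylor expansion, which is a crude proxy
    for how many nearby samples are needed to estimate derivatives up
    to order k. It is not the bound derived in the paper.
    """
    return comb(d + k, k)


if __name__ == "__main__":
    # Illustrative input dimensions and derivative orders (assumed values).
    for d in (2, 10, 100):
        for k in (1, 2, 3):
            print(f"d={d:>3}  k={k}  coefficients={num_taylor_coefficients(d, k)}")
```

Under these assumptions, the count stays small for d = 2 but already reaches 176851 for d = 100 and k = 3, which suggests why exploiting high-order smoothness may be out of reach when the number of samples is small relative to the input dimension.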