Starting Small -- Learning with Adaptive Sample Sizes
This work addresses scalable training on large datasets. The contribution is incremental, but the method reduces the number of computational steps and offers concrete speed-ups for practitioners in data-intensive ML applications.
The paper addresses efficient training of machine learning models when data is abundant and multiple passes through the full training set are prohibitive. It introduces a novel algorithm that dynamically increases the sample size used by iterative methods such as stochastic gradient descent, reaching statistical accuracy on an $n$-sample in $2n$ steps instead of $n \log n$ steps.
For many machine learning problems, data is abundant and it may be prohibitive to make multiple passes through the full training set. In this context, we investigate strategies for dynamically increasing the effective sample size when using iterative methods such as stochastic gradient descent. Our interest is motivated by the rise of variance-reduced methods, which achieve linear convergence rates that scale favorably for smaller sample sizes. Exploiting this feature, we show -- theoretically and empirically -- how to obtain significant speed-ups with a novel algorithm that reaches statistical accuracy on an $n$-sample in $2n$ steps instead of $n \log n$ steps.
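To make the gap between $2n$ and $n \log n$ steps concrete, the following minimal sketch counts steps under a simplified model. The doubling schedule, the budget of one step per sample at each stage, and the function names `adaptive_schedule`, `adaptive_steps`, and `fixed_steps` are illustrative assumptions, not the algorithm analyzed in the paper. The sketch only shows why a geometrically growing sample size keeps the total step count below $2n$.

```python
import math


def adaptive_schedule(n):
    """Return sample sizes that roughly double until they reach n.

    Assumption (illustrative only): sizes are built by repeatedly
    halving n with integer division, then reversed to grow upward.
    """
    sizes = []
    m = n
    while m >= 1:
        sizes.append(m)
        m //= 2
    return sizes[::-1]


def adaptive_steps(n):
    """Total steps if each stage of size m is given m steps.

    The per-stage budget of m steps is a hypothetical choice; the
    geometric sum n + n/2 + n/4 + ... stays below 2n.
    """
    return sum(adaptive_schedule(n))


def fixed_steps(n):
    """Reference count n * log(n) (natural log, constants ignored)."""
    return n * math.log(n)


if __name__ == "__main__":
    for n in (1_000, 1_000_000):
        print(n, adaptive_steps(n), 2 * n, round(fixed_steps(n)))
```

Under these assumptions the adaptive count is bounded by $2n$ for every $n$, while the fixed-sample reference grows by an extra $\log n$ factor, so the relative saving widens as the dataset grows.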