Reducing Runtime by Recycling Samples
This work addresses runtime optimization for practitioners using stochastic variance-reduction methods in machine learning. Its contribution is incremental, since it builds on existing methods.
The paper tackles runtime inefficiency in stochastic variance-reduction methods by proposing to reuse previously used samples rather than drawing fresh ones. It demonstrates empirically for SDCA, SAG, and SVRG that this can reduce runtime, and it reports behavior suggesting that running SDCA for an integer number of epochs may be wasteful.
Contrary to the situation with stochastic gradient descent, we argue that when using stochastic methods with variance reduction, such as SDCA, SAG or SVRG, as well as their variants, it could be beneficial to reuse previously used samples instead of fresh samples, even when fresh samples are available. We demonstrate this empirically for SDCA, SAG and SVRG, studying the optimal sample size one should use, and also uncover behavior that suggests running SDCA for an integer number of epochs could be wasteful.
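To make the idea of recycling samples concrete, here is a minimal sketch of SVRG run repeatedly over a fixed subsample drawn from a larger pool. It is an illustration only, not the paper's experimental setup: the one-dimensional least-squares objective, the synthetic data, the subsample size of 200, the step size, and all function names are assumptions chosen for brevity.

```python
import random


def make_pool(size, true_w=2.0, noise=0.1, seed=1):
    """Synthetic pool of (x, y) pairs with y = true_w * x + Gaussian noise (assumed data)."""
    rng = random.Random(seed)
    pool = []
    for _ in range(size):
        x = rng.uniform(-1.0, 1.0)
        pool.append((x, true_w * x + rng.gauss(0.0, noise)))
    return pool


def mean_loss(w, samples):
    """Average squared loss 0.5 * (w * x - y)^2 over the given samples."""
    return sum(0.5 * (w * x - y) ** 2 for x, y in samples) / len(samples)


def svrg_on_recycled_samples(samples, epochs=20, step=0.05, seed=0):
    """SVRG for 1-D least squares that only ever touches the fixed list `samples`."""
    rng = random.Random(seed)
    n = len(samples)
    w = 0.0
    for _ in range(epochs):
        # Snapshot point and its full gradient over the recycled samples.
        w_snap = w
        full_grad = sum((w_snap * x - y) * x for x, y in samples) / n
        for _ in range(n):
            # Draw from the same fixed set again instead of requesting a fresh sample.
            x, y = samples[rng.randrange(n)]
            grad = (w * x - y) * x
            grad_snap = (w_snap * x - y) * x
            # Variance-reduced gradient estimate.
            w -= step * (grad - grad_snap + full_grad)
    return w


pool = make_pool(10000)
subsample = pool[:200]  # hypothetical sample size; the paper studies how to choose it
w_hat = svrg_on_recycled_samples(subsample)
print("estimated w:", w_hat)
print("loss on full pool:", mean_loss(w_hat, pool))
```

The full-gradient snapshot is what ties the method to a fixed sample set: it can only be computed over samples that are stored and revisited, which is why the fresh-versus-recycled trade-off differs from plain stochastic gradient descent.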