Bayes Optimal Early Stopping Policies for Black-Box Optimization
This work addresses the challenge of efficient algorithm selection and early stopping in hyperparameter tuning for machine learning practitioners, offering significant speed-ups but being incremental as it builds on existing methods like Hyperband.
The paper tackles the problem of adaptively restarting randomized algorithms in black-box optimization to minimize expected time to success, deriving an optimal policy based on Bayesian priors and observed run features. On CIFAR-10 and ImageNet hyperparameter tuning, the policies achieve up to a 13x improvement over random search and up to a 3x improvement over a baseline adaptive policy in expected time to reach target accuracy.
We derive an optimal policy for adaptively restarting a randomized algorithm, based on observed features of the run-so-far, so as to minimize the expected time required for the algorithm to successfully terminate. Given a suitable Bayesian prior, this result can be used to select the optimal black-box optimization algorithm from among a large family of algorithms that includes random search, Successive Halving, and Hyperband. On CIFAR-10 and ImageNet hyperparameter tuning problems, the proposed policies offer up to a factor of 13 improvement over random search in terms of expected time to reach a given target accuracy, and up to a factor of 3 improvement over a baseline adaptive policy that terminates a run whenever its accuracy is below-median.