Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings
This work supports hyperparameter optimization by improving learning curve extrapolation. The contribution appears incremental: it builds on existing methods and adds targeted enhancements to them.
The authors address the problem of extrapolating learning curves for iterative ML algorithms, such as SGD for training deep networks. They propose probabilistic models based on random forests and Bayesian RNNs, and report that these predict more accurately than state-of-the-art models from the hyperparameter optimization literature.
We propose probabilistic models that can extrapolate learning curves of iterative machine learning algorithms, such as stochastic gradient descent for training deep networks, based on training data with variable-length learning curves. We study instantiations of this framework based on random forests and Bayesian recurrent neural networks. Our experiments show that these models yield better predictions than state-of-the-art models from the hyperparameter optimization literature when extrapolating the performance of neural networks trained with different hyperparameter settings.
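The general idea of a probabilistic rollout can be illustrated with a minimal sketch. It is not the random forest or Bayesian recurrent network studied here. As a deliberately simple stand-in, it fits a bootstrap ensemble of linear one-step predictors on variable-length curves, rolls each member forward from an observed prefix, and treats the spread across members as predictive uncertainty. All names (`fit_one_step`, `fit_ensemble`, `rollout`), the synthetic curves, and the choice to ignore hyperparameter inputs are assumptions made for illustration only.

```python
import random
import statistics


def fit_one_step(pairs):
    """Least-squares fit of y_next = a * y_prev + b from (y_prev, y_next) pairs."""
    n = len(pairs)
    mean_prev = sum(p for p, _ in pairs) / n
    mean_next = sum(q for _, q in pairs) / n
    sxx = sum((p - mean_prev) ** 2 for p, _ in pairs)
    if sxx == 0.0:
        # Degenerate bootstrap sample: fall back to predicting the mean.
        return 0.0, mean_next
    a = sum((p - mean_prev) * (q - mean_next) for p, q in pairs) / sxx
    return a, mean_next - a * mean_prev


def fit_ensemble(curves, n_members=50, seed=0):
    """Fit one linear one-step model per bootstrap resample of all transitions."""
    rng = random.Random(seed)
    # Variable-length curves are handled by pooling their one-step transitions.
    pairs = [(c[t], c[t + 1]) for c in curves for t in range(len(c) - 1)]
    return [
        fit_one_step([rng.choice(pairs) for _ in pairs])
        for _ in range(n_members)
    ]


def rollout(ensemble, prefix, horizon):
    """Roll every ensemble member forward from the last observed value.

    Returns the mean and standard deviation of the predicted value at
    step `horizon` (counted from the start of the curve).
    """
    finals = []
    for a, b in ensemble:
        y = prefix[-1]
        for _ in range(horizon - len(prefix)):
            y = a * y + b
        finals.append(y)
    return statistics.mean(finals), statistics.stdev(finals)


# Hypothetical training data: noisy curves that saturate near 0.9,
# each with a different starting value and a different length.
data_rng = random.Random(1)
curves = []
for _ in range(10):
    y = data_rng.uniform(0.1, 0.5)
    curve = [y]
    for _ in range(data_rng.randint(5, 20)):
        y = 0.8 * y + 0.18 + data_rng.gauss(0.0, 0.005)
        curve.append(y)
    curves.append(curve)

ensemble = fit_ensemble(curves)
pred_mean, pred_std = rollout(ensemble, prefix=[0.2, 0.34, 0.45], horizon=30)
print(f"predicted value at step 30: {pred_mean:.3f} +/- {pred_std:.3f}")
```

A model in the spirit of the abstract would additionally condition each prediction on the hyperparameter setting, so that curves from different configurations inform one another. The sketch omits this to stay short.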