Hot Swapping for Online Adaptation of Optimization Hyperparameters
This work addresses the challenge of tuning optimization hyperparameters online, while training is in progress. The contribution appears incremental, since it builds on existing bandit strategies.
The paper tackles online adaptation of optimization hyperparameters by introducing a 'hot swapping' framework, in which hyperparameter values are changed during learning. It applies an explore-exploit strategy from the multi-armed bandit literature to adaptive learning rate selection, and on a benchmark neural network it reports consistently better solutions than alternatives such as AdaDelta and stochastic gradient with exhaustive hyperparameter search.
We describe a general framework for online adaptation of optimization hyperparameters by 'hot swapping' their values during learning. We investigate this approach in the context of adaptive learning rate selection using an explore-exploit strategy from the multi-armed bandit literature. Experiments on a benchmark neural network show that the hot swapping approach leads to consistently better solutions compared to well-known alternatives such as AdaDelta and stochastic gradient with exhaustive hyperparameter search.
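To make the idea concrete, the following is a minimal sketch of bandit-driven learning rate swapping, not the authors' implementation. The abstract does not specify the bandit algorithm, the reward signal, or how often values are swapped, so everything here is an assumption chosen for illustration: UCB1 as the explore-exploit rule, the clipped relative decrease in loss as the reward, fixed-length rounds between swaps, and a noisy toy quadratic in place of a neural network. The names `hot_swap_sgd`, `noisy_grad`, and `steps_per_round` are hypothetical.

```python
import math
import random


def loss(w):
    # Toy quadratic objective standing in for a network's training loss (assumption).
    return 0.5 * sum(x * x for x in w)


def noisy_grad(w, rng, noise=0.1):
    # Exact gradient of the quadratic plus Gaussian noise, mimicking a minibatch gradient.
    return [x + rng.gauss(0.0, noise) for x in w]


def hot_swap_sgd(w, rates, rounds=200, steps_per_round=5, seed=0):
    """Run SGD, choosing the learning rate for each round with UCB1 (assumed rule)."""
    rng = random.Random(seed)
    counts = [0] * len(rates)   # number of rounds each candidate rate was used
    means = [0.0] * len(rates)  # running mean reward per candidate rate
    history = []                # learning rate used in each round

    for t in range(1, rounds + 1):
        if t <= len(rates):
            # Try every candidate once before using the confidence bounds.
            arm = t - 1
        else:
            # UCB1: mean reward plus an exploration bonus that shrinks with use.
            arm = max(
                range(len(rates)),
                key=lambda i: means[i] + math.sqrt(2.0 * math.log(t) / counts[i]),
            )

        before = loss(w)
        for _ in range(steps_per_round):
            g = noisy_grad(w, rng)
            w = [x - rates[arm] * gi for x, gi in zip(w, g)]
        after = loss(w)

        # Reward (assumption): relative loss decrease over the round, clipped to [0, 1].
        reward = max(0.0, min(1.0, (before - after) / (before + 1e-12)))
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        history.append(rates[arm])

    return w, history, counts


if __name__ == "__main__":
    start = [5.0, -3.0, 2.0]
    candidates = [0.001, 0.01, 0.1, 0.5]
    final_w, used, pulls = hot_swap_sgd(start, candidates)
    print("initial loss:", loss(start))
    print("final loss:", loss(final_w))
    print("rounds per learning rate:", dict(zip(candidates, pulls)))
```

In this sketch the optimizer state (here just `w`) carries over unchanged when the learning rate is swapped, which is what distinguishes it from restarting training for each candidate value as an exhaustive search would. Other reward definitions, such as validation loss, would fit the same loop.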