Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search
This work addresses inefficiencies in automated deep learning that affect both researchers and practitioners. Its contribution is incremental, since it builds on existing neural architecture search (NAS) and hyperparameter optimization techniques.
The paper shows that architectural choices and hyperparameters interact, so running neural architecture search and hyperparameter tuning as separate steps can be suboptimal. It proposes a joint search method that combines Bayesian optimization and Hyperband to improve efficiency.
While existing work on neural architecture search (NAS) tunes hyperparameters in a separate post-processing step, we demonstrate that architectural choices and other hyperparameter settings interact in a way that can render this separation suboptimal. Likewise, we demonstrate that the common practice of using very few epochs during the main NAS and much larger numbers of epochs during a post-processing step is inefficient due to little correlation in the relative rankings for these two training regimes. To combat both of these problems, we propose to use a recent combination of Bayesian optimization and Hyperband for efficient joint neural architecture and hyperparameter search.
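To make the idea of a joint search over architecture and hyperparameters more concrete, the following is a minimal sketch and not the authors' implementation. It uses plain successive halving (the budget-allocation routine inside Hyperband) with random sampling, so the model-based Bayesian optimization component of the proposed method is deliberately left out. The search space (`num_layers`, `width`, `learning_rate`), the synthetic objective `toy_validation_error`, and all default values are assumptions introduced only for illustration.

```python
import math
import random


def sample_config(rng):
    """Draw one joint configuration: architecture and training hyperparameters together."""
    return {
        "num_layers": rng.choice([2, 4, 8]),         # architectural choice (hypothetical)
        "width": rng.choice([16, 32, 64]),           # architectural choice (hypothetical)
        "learning_rate": 10 ** rng.uniform(-4, -1),  # training hyperparameter (hypothetical range)
    }


def toy_validation_error(config, epochs):
    """Synthetic stand-in for training a network for `epochs` epochs.

    The best learning rate shifts with depth, so the architecture and the
    hyperparameter interact. This is an illustrative assumption, not a result.
    """
    best_log_lr = -1.5 - 0.25 * config["num_layers"]
    lr_penalty = (math.log10(config["learning_rate"]) - best_log_lr) ** 2
    return lr_penalty + 1.0 / config["width"] + 1.0 / epochs


def successive_halving(configs, min_epochs=1, max_epochs=27, eta=3):
    """Evaluate all configs on a small budget, keep the best 1/eta, and repeat
    with eta times the budget until the maximum budget is reached."""
    survivors = list(configs)
    epochs = min_epochs
    while True:
        ranked = sorted(survivors, key=lambda c: toy_validation_error(c, epochs))
        if epochs >= max_epochs or len(ranked) == 1:
            return ranked[0], epochs
        survivors = ranked[: max(1, len(ranked) // eta)]
        epochs = min(epochs * eta, max_epochs)


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [sample_config(rng) for _ in range(27)]
    best, final_epochs = successive_halving(candidates)
    print(best, final_epochs)
```

In this toy objective the ranking of configurations is the same at every budget, which is exactly the assumption the abstract calls into question for very short versus long training runs. With a real training function, a configuration discarded at a small budget might have ranked well at a large one.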