LG Feb 17, 2021

Optimizing Large-Scale Hyperparameters via Automated Learning Algorithm

Bin Gu, Guodong Liu, Yanfu Zhang, Xiang Geng, Heng Huang

arXiv:2102.09026v111.922 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of tuning many hyperparameters in machine learning, which is crucial for model performance. The approach appears incremental because it builds on existing black-box and gradient-based methods.

The paper proposes HOZOG, a hyperparameter optimization method that combines black-box and gradient-based approaches. The authors report gains in simplicity, scalability, flexibility, effectiveness, and efficiency over state-of-the-art methods on tasks with up to 1250 hyperparameters.

Modern machine learning algorithms usually involve tuning multiple (from one to thousands of) hyperparameters, which play a pivotal role in model generalizability. Black-box optimization and gradient-based algorithms are the two dominant approaches to hyperparameter optimization, but they have entirely distinct advantages. How to design a new hyperparameter optimization technique that inherits the benefits of both approaches is still an open problem. To address this problem, we propose a new hyperparameter optimization method with zeroth-order hyper-gradients (HOZOG). Specifically, we first formulate hyperparameter optimization exactly as an A-based constrained optimization problem, where A is a black-box optimization algorithm (such as a deep neural network). Then, we use the average zeroth-order hyper-gradients to update the hyperparameters. We provide a feasibility analysis of using HOZOG for hyperparameter optimization. Finally, experimental results on three representative hyperparameter optimization tasks (with 1 to 1250 hyperparameters) demonstrate the benefits of HOZOG in terms of simplicity, scalability, flexibility, effectiveness, and efficiency compared with state-of-the-art hyperparameter optimization methods.
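
The idea of updating hyperparameters with averaged zeroth-order hyper-gradients can be illustrated with a small sketch. This is not the authors' implementation: it uses a standard random-direction finite-difference estimator, which is assumed here to be in the spirit of what the abstract describes. The function names (`zo_hypergrad`, `optimize`, `train`, `val_loss`), the ridge-regression learner standing in for the black-box algorithm A, and all default values (`q`, `mu`, `lr`) are hypothetical choices made for illustration.

```python
import numpy as np


def zo_hypergrad(f, lam, rng, q=10, mu=1e-3):
    """Average of q random-direction finite-difference gradient estimates.

    f   : black-box objective, e.g. validation loss after running learner A
    lam : current hyperparameter vector (1-D float array)
    q   : number of random directions averaged (assumed value)
    mu  : smoothing radius of the finite difference (assumed value)
    """
    d = lam.size
    f0 = f(lam)  # one baseline evaluation shared by all q directions
    g = np.zeros(d)
    for _ in range(q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # uniform direction on the unit sphere
        g += (d / mu) * (f(lam + mu * u) - f0) * u
    return g / q


def optimize(f, lam0, steps=200, lr=0.1, q=10, mu=1e-3, seed=0):
    """Plain gradient descent on lam using the averaged zeroth-order estimate."""
    rng = np.random.default_rng(seed)
    lam = np.asarray(lam0, dtype=float).copy()
    for _ in range(steps):
        lam -= lr * zo_hypergrad(f, lam, rng, q=q, mu=mu)
    return lam


# Toy stand-in for the black-box learner A (an assumption, not from the paper):
# ridge regression with one penalty exp(lam_j) per feature.
def train(lam, X, y):
    return np.linalg.solve(X.T @ X + np.diag(np.exp(lam)), X.T @ y)


def val_loss(lam, data):
    """Outer objective: validation MSE of the model returned by train()."""
    X_tr, y_tr, X_val, y_val = data
    w = train(lam, X_tr, y_tr)
    return float(np.mean((X_val @ w - y_val) ** 2))
```

Under these assumptions, a call such as `optimize(lambda lam: val_loss(lam, data), np.zeros(n_features))` treats training as a black box: each step needs only `q + 1` training runs and no differentiation through the learner, which is the property that lets a zeroth-order scheme scale to many hyperparameters.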
