Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization
This work addresses hyperparameter tuning for practitioners who use machine learning in production. Its contribution is incremental: it builds on the existing concept of tunability and on existing tools such as mlr and SPOT.
The paper tackles hyperparameter optimization for machine learning algorithms through an experimental analysis of 30 hyperparameters from six algorithms. It reports two parameter tuning studies and one extensive global tuning study, and it introduces a new method, based on consensus ranking, for analyzing results from multiple algorithms.
Machine learning algorithms such as random forests or xgboost are gaining importance and are increasingly incorporated into production processes to enable comprehensive digitization and, where possible, automation. The hyperparameters of these algorithms have to be set appropriately, a task referred to as hyperparameter tuning or optimization. Based on the concept of tunability, this article presents an overview of theoretical and practical results for popular machine learning algorithms. This overview is accompanied by an experimental analysis of 30 hyperparameters from six relevant machine learning algorithms. In particular, the article provides (i) a survey of important hyperparameters, (ii) two parameter tuning studies, (iii) one extensive global parameter tuning study, and (iv) a new way, based on consensus ranking, to analyze results from multiple algorithms. The R package mlr is used as a uniform interface to the machine learning models. The R package SPOT is used to perform the actual tuning (optimization). All additional code is provided together with this paper.
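To give a rough idea of what analyzing results from multiple algorithms via consensus ranking can look like, the following minimal sketch combines several per-dataset rankings into a single ranking. The abstract does not specify the paper's actual consensus method, so a simple Borda count is used here as a stand-in. The function name `borda_consensus`, the learner names, and the rankings themselves are hypothetical and chosen only for illustration. The paper's own code is written in R; Python is used here purely for the sketch.

```python
from collections import defaultdict


def borda_consensus(rankings):
    """Aggregate several rankings (best item first) into one consensus ranking.

    An item at 0-based position p in a ranking of n items receives
    n - 1 - p points. Items are then sorted by total points, highest first.
    Ties in total points are broken alphabetically so the result is
    deterministic. This is a plain Borda count, an assumed stand-in and
    not necessarily the method used in the paper.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rankings of three learners on three data sets (best first).
per_dataset_rankings = [
    ["xgboost", "random_forest", "svm"],
    ["random_forest", "xgboost", "svm"],
    ["xgboost", "svm", "random_forest"],
]

consensus = borda_consensus(per_dataset_rankings)
print(consensus)
```

In this made-up example, xgboost collects 5 points, random_forest 3, and svm 1, so the consensus order follows those totals. A Borda count is only one of several possible aggregation rules. It was chosen here because it needs no dependencies and is easy to check by hand.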