Automated Machine Learning with Monte-Carlo Tree Search
The work addresses the challenge of automating machine learning pipeline optimization for practitioners, though the contribution appears incremental because it builds on existing methods such as Monte-Carlo tree search (MCTS) and Bayesian optimization.
The paper tackles the AutoML problem of selecting algorithms and hyperparameters for optimal dataset performance by introducing Mosaic, a Monte-Carlo tree search-based approach, which achieves statistically significant gains over Auto-Sklearn on the OpenML 100 benchmark and Scikit-learn portfolio.
The AutoML task consists of selecting the proper algorithm in a machine learning portfolio, together with its hyperparameter values, in order to deliver the best performance on the dataset at hand. Mosaic, a Monte-Carlo tree search (MCTS) based approach, is presented to handle the hybrid structural and parametric, expensive black-box optimization problem that AutoML poses. Extensive empirical studies are conducted to independently assess and compare: i) the optimization processes based on Bayesian optimization or MCTS; ii) the warm-start initialization of the search; iii) the ensembling of the solutions gathered along the search. Mosaic is assessed on the OpenML 100 benchmark and the Scikit-learn portfolio, with statistically significant gains over Auto-Sklearn, winner of former international AutoML challenges.
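To make the search setting concrete, the following is a minimal sketch of tree search over a two-level space (first the algorithm, then one hyperparameter value), using the standard UCB1 rule to balance exploration and exploitation. It is not Mosaic's implementation: the search space, the `evaluate` function (a synthetic noisy score standing in for an expensive cross-validated evaluation), the exploration constant, and all names are assumptions chosen for illustration only.

```python
import math
import random

# Hypothetical toy search space: algorithm -> candidate values of one hyperparameter.
SEARCH_SPACE = {
    "svm": [0.01, 0.1, 1.0, 10.0],
    "random_forest": [10, 50, 100, 200],
    "knn": [1, 3, 5, 11],
}


def evaluate(algorithm, value, rng):
    """Synthetic noisy score (assumption): replaces a real, costly model evaluation."""
    best = {"svm": 1.0, "random_forest": 100, "knn": 5}[algorithm]
    base = {"svm": 0.80, "random_forest": 0.90, "knn": 0.70}[algorithm]
    penalty = 0.05 * abs(math.log10(value) - math.log10(best))
    return base - penalty + rng.gauss(0.0, 0.01)


class Node:
    """Tree node storing visit count and cumulated reward."""

    def __init__(self, children_keys):
        self.visits = 0
        self.total = 0.0
        self.children_keys = children_keys
        self.children = {}

    def mean(self):
        return self.total / self.visits if self.visits else 0.0


def ucb_select(node, c=0.5):
    """Return an untried child key if one exists, else the key maximizing UCB1."""
    for key in node.children_keys:
        if key not in node.children:
            return key
    return max(
        node.children_keys,
        key=lambda k: node.children[k].mean()
        + c * math.sqrt(math.log(node.visits) / node.children[k].visits),
    )


def mcts(budget, seed=0):
    """Run `budget` evaluations; return the tree root and the best (algo, value, score)."""
    rng = random.Random(seed)
    root = Node(list(SEARCH_SPACE))
    best = (None, None, -math.inf)
    for _ in range(budget):
        # Selection / expansion: pick an algorithm, then a hyperparameter value.
        algo = ucb_select(root)
        algo_node = root.children.setdefault(algo, Node(SEARCH_SPACE[algo]))
        value = ucb_select(algo_node)
        leaf = algo_node.children.setdefault(value, Node([]))
        # Evaluation: one (simulated) expensive black-box call.
        score = evaluate(algo, value, rng)
        # Backpropagation: update statistics along the visited path.
        for node in (root, algo_node, leaf):
            node.visits += 1
            node.total += score
        if score > best[2]:
            best = (algo, value, score)
    return root, best


if __name__ == "__main__":
    root, (algo, value, score) = mcts(budget=200)
    print("best configuration found:", algo, value, round(score, 3))
    for name, child in root.children.items():
        print(name, "visits:", child.visits)
```

In this toy setting the evaluation budget concentrates on the branch with the highest average score while still occasionally revisiting the others. A realistic AutoML search would additionally have to handle continuous and conditional hyperparameters and multi-step pipelines, which this sketch deliberately leaves out.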