ML DB LG

Jun 28, 2018

Automatic Exploration of Machine Learning Experiments on OpenML

Daniel Kühn, Philipp Probst, Janek Thomas, Bernd Bischl

arXiv:1806.10961v311.626 citations

Originality: Synthesis-oriented

AI Analysis

The dataset is a resource for researchers studying hyperparameter effects and tuning, though the contribution is incremental: it focuses on data collection rather than on new methods.

The paper tackles the scarcity of experimental metadata for understanding hyperparameter influence by presenting a large, open dataset of 2.5 million experiments across 38 OpenML datasets and six algorithms, generated via automated random sampling.

Understanding the influence of hyperparameters on the performance of a machine learning algorithm is an important scientific topic in itself and can help to improve automatic hyperparameter tuning procedures. Unfortunately, experimental metadata for this purpose is still rare. This paper presents a large, free and open dataset addressing this problem, containing results on 38 OpenML datasets, six different machine learning algorithms and many different hyperparameter configurations. Results were generated by an automated random sampling strategy, termed the OpenML Random Bot. Each algorithm was cross-validated up to 20,000 times per dataset with different hyperparameter settings, resulting in a meta dataset of around 2.5 million experiments overall.
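The random sampling strategy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' bot: the search space, the function names (sample_configuration, run_random_bot, dummy_evaluate) and the scoring function are hypothetical stand-ins, and a real run would replace dummy_evaluate with an actual cross-validation of a learner on an OpenML dataset. The last constant works out the upper bound implied by the abstract's figures.

```python
import random

# Hypothetical search space (not taken from the paper): each entry maps a
# hyperparameter name to a function that draws one random value.
SEARCH_SPACE = {
    "max_depth": lambda rng: rng.randint(1, 30),
    "min_samples_leaf": lambda rng: rng.randint(1, 60),
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform draw
}


def sample_configuration(rng):
    """Draw one configuration independently at random from SEARCH_SPACE."""
    return {name: draw(rng) for name, draw in SEARCH_SPACE.items()}


def run_random_bot(dataset_ids, evaluate, n_runs, seed=0):
    """Collect one meta-dataset row per (dataset, random configuration) pair.

    `evaluate(dataset_id, config)` is a caller-supplied placeholder that
    should return a cross-validated performance score.
    """
    rng = random.Random(seed)
    rows = []
    for dataset_id in dataset_ids:
        for _ in range(n_runs):
            config = sample_configuration(rng)
            score = evaluate(dataset_id, config)
            rows.append({"dataset": dataset_id, **config, "score": score})
    return rows


def dummy_evaluate(dataset_id, config):
    """Stand-in for cross-validation; returns a made-up score in (0, 1]."""
    return 1.0 / (1.0 + abs(config["max_depth"] - 10) * config["learning_rate"])


# Upper bound implied by the abstract: 38 datasets, 6 algorithms, and
# up to 20,000 runs per algorithm and dataset.
MAX_EXPERIMENTS = 38 * 6 * 20_000

if __name__ == "__main__":
    meta = run_random_bot(dataset_ids=[1, 2, 3], evaluate=dummy_evaluate, n_runs=5)
    print(len(meta), "rows collected")
    print("upper bound on experiments:", MAX_EXPERIMENTS)
```

The bound comes to 4,560,000 experiments, so the reported total of around 2.5 million is consistent with the "up to 20,000" wording: not every algorithm and dataset pair reached the cap.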
