LG ML · Jul 25, 2022

AMLB: an AutoML Benchmark

Pieter Gijsbers, Marcos L. P. Bueno, Stefan Coors, Erin LeDell, Sébastien Poirier, Janek Thomas, Bernd Bischl, Joaquin Vanschoren

arXiv:2207.12560v2 · 25.7 · 103 citations · h-index: 48 · Has Code

Originality: Synthesis-oriented

AI Analysis

This benchmark gives researchers and practitioners a standardized tool for reliably comparing AutoML frameworks, which addresses a common problem in the field. The contribution is incremental, since it builds on existing benchmarking practices.

The authors tackled the challenge of comparing AutoML frameworks by introducing an open, extensible benchmark that follows best practices, conducting a thorough evaluation of 9 frameworks across 104 tasks to assess accuracy, inference time, and failures.

Comparing different AutoML frameworks is notoriously challenging and often done incorrectly. We introduce an open and extensible benchmark that follows best practices and avoids common mistakes when comparing AutoML frameworks. We conduct a thorough comparison of 9 well-known AutoML frameworks across 71 classification and 33 regression tasks. The differences between the AutoML frameworks are explored with a multi-faceted analysis, evaluating model accuracy, its trade-offs with inference time, and framework failures. We also use Bradley-Terry trees to discover subsets of tasks where the relative AutoML framework rankings differ. The benchmark comes with an open-source tool that integrates with many AutoML frameworks and automates the empirical evaluation process end-to-end: from framework installation and resource allocation to in-depth evaluation. The benchmark uses public data sets, can be easily extended with other AutoML frameworks and tasks, and has a website with up-to-date results.
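
To make the ranking idea in the abstract concrete, the following is a minimal sketch of a plain Bradley-Terry model fitted to pairwise win counts. It is not the authors' implementation: a Bradley-Terry tree additionally partitions the tasks by their characteristics and fits a model like this one in each partition, which is omitted here. The function name `fit_bradley_terry`, the fitting procedure (a standard iterative MM update), and the win counts are all assumptions made for illustration; the counts are not results from the paper.

```python
def fit_bradley_terry(wins, iterations=200):
    """Estimate Bradley-Terry strengths with the standard MM update.

    wins[i][j] is the number of comparisons in which item i beat item j.
    Assumes every item has at least one win, so no strength collapses to zero.
    """
    n = len(wins)
    strengths = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes its number of comparisons, weighted by
            # the current combined strength of the two items.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n) if j != i
            )
            updated.append(total_wins / denom if denom > 0 else strengths[i])
        norm = sum(updated)
        strengths = [s / norm for s in updated]  # normalise to sum to 1
    return strengths


# Hypothetical win counts for three placeholder frameworks
# (illustrative only, not taken from the benchmark).
wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = fit_bradley_terry(wins)
```

Under this model, the probability that item i beats item j is `strengths[i] / (strengths[i] + strengths[j])`, so a larger strength corresponds to a higher rank. Differences in these fitted strengths between task subsets are what would indicate that relative framework rankings differ.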
