ML · LG · Aug 11, 2017

OpenML Benchmarking Suites

arXiv:1708.03731v3 · 224 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of inconsistent, non-reproducible benchmarks in machine learning research. The contribution is incremental, since it builds on existing platforms and tools.

The paper tackles the need for standardized, reproducible machine learning benchmarks by introducing OpenML benchmarking suites: curated task collections with accompanying software tools, integrated into the OpenML platform. The first such suite is OpenML-CC18, a curated suite for classification.

Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. We advocate the use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks. We enable this through software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the OpenML platform, and accessible through interfaces in Python, Java, and R. OpenML benchmarking suites (a) are easy to use through standardized data formats, APIs, and client libraries; (b) come with extensive meta-information on the included datasets; and (c) allow benchmarks to be shared and reused in future studies. We then present a first, carefully curated and practical benchmarking suite for classification: the OpenML Curated Classification benchmarking suite 2018 (OpenML-CC18). Finally, we discuss use cases and applications which demonstrate the usefulness of OpenML benchmarking suites and the OpenML-CC18 in particular.
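As a rough illustration of the Python interface mentioned in the abstract, the sketch below fetches the task IDs of a suite and applies one evaluation routine to every task. It assumes the third-party `openml` package and its `openml.study.get_suite` call; the helper names `load_suite_task_ids` and `run_benchmark`, and the `evaluate` callback, are hypothetical and not part of OpenML.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

R = TypeVar("R")


def load_suite_task_ids(suite_alias: str = "OpenML-CC18") -> List[int]:
    """Fetch the task IDs of a benchmarking suite from the OpenML server.

    Hypothetical helper. Requires `pip install openml` and network access.
    """
    # Imported lazily so the rest of the sketch runs without the package.
    import openml

    suite = openml.study.get_suite(suite_alias)
    return list(suite.tasks)


def run_benchmark(
    task_ids: Iterable[int], evaluate: Callable[[int], R]
) -> Dict[int, R]:
    """Apply the same evaluation routine to every task in the suite.

    `evaluate` is a user-supplied callback (an assumption of this sketch)
    that takes a task ID and returns a result, for example an accuracy.
    """
    return {task_id: evaluate(task_id) for task_id in task_ids}
```

With `openml` installed, `run_benchmark(load_suite_task_ids(), my_evaluate)` would run a user-defined `my_evaluate` on each task of the suite. Keeping the task list separate from the evaluation routine reflects the idea that the suite fixes what is benchmarked, while the study decides how.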

Code Implementations: 4 repos