LG, Oct 29, 2021

A Scalable AutoML Approach Based on Graph Neural Networks

arXiv:2111.00083v4, 16 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of scalable and efficient AutoML for practitioners by improving pipeline recommendation accuracy, though it is incremental as it builds on existing meta-learning approaches.

The authors tackled the problem of automating machine learning pipeline creation by developing KGpip, a meta-learning system that uses dataset embeddings and graph generation to recommend pipelines, and demonstrated that it significantly outperforms state-of-the-art AutoML systems on 126 datasets.

AutoML systems build machine learning models automatically by performing a search over valid data transformations and learners, along with hyper-parameter optimization for each learner. Many AutoML systems use meta-learning to guide the search for optimal pipelines. In this work, we present a novel meta-learning system called KGpip, which (1) builds a database of datasets and corresponding pipelines by mining thousands of scripts with program analysis, (2) uses dataset embeddings to find similar datasets in the database based on their content instead of metadata-based features, and (3) models AutoML pipeline creation as a graph generation problem, to succinctly characterize the diverse pipelines seen for a single dataset. KGpip's meta-learning is a sub-component for AutoML systems. We demonstrate this by integrating KGpip with two AutoML systems. Our comprehensive evaluation using 126 datasets, including those used by the state-of-the-art systems, shows that KGpip significantly outperforms these systems.
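As an illustration of step (2), the lookup of a similar dataset by embedding can be sketched as a nearest-neighbor search. This is a minimal sketch under stated assumptions, not KGpip's implementation: the abstract does not say how embeddings are computed or compared, so the cosine similarity measure, the three-dimensional toy vectors, the dataset names, and the helper names `cosine_similarity` and `nearest_dataset` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_dataset(query_embedding, database):
    """Return (name, pipelines) of the stored dataset most similar to the query.

    `database` maps a dataset name to a pair (embedding, pipelines), where
    `pipelines` is a list of pipelines and each pipeline is a list of step names.
    """
    best_name = max(
        database,
        key=lambda name: cosine_similarity(query_embedding, database[name][0]),
    )
    return best_name, database[best_name][1]


# Toy database: the embeddings and pipelines are made up for illustration.
database = {
    "titanic_like": (
        [0.9, 0.1, 0.0],
        [["SimpleImputer", "OneHotEncoder", "XGBClassifier"]],
    ),
    "housing_like": (
        [0.1, 0.8, 0.3],
        [["StandardScaler", "GradientBoostingRegressor"]],
    ),
}

# Hypothetical embedding of an unseen dataset.
query = [0.85, 0.2, 0.05]
name, pipelines = nearest_dataset(query, database)
print(name, pipelines)
```

In this toy setup, the pipelines attached to the retrieved dataset would be the starting point for recommendation. The abstract describes generating pipeline graphs in step (3) rather than copying stored pipelines, and that part is not shown here.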

