ML · May 15, 2017

Probabilistic Matrix Factorization for Automated Machine Learning

arXiv:1705.05355v2 · 145 citations
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of automating machine learning for practitioners, though it appears incremental because it builds on existing techniques.

The paper tackles the problem of automating machine learning pipeline selection and tuning by combining collaborative filtering and Bayesian optimization. It shows that the approach quickly identifies high-performing pipelines, significantly outperforming the current state of the art.

In order to achieve state-of-the-art performance, modern machine learning techniques require careful data pre-processing and hyperparameter tuning. Moreover, given the ever-increasing number of machine learning models being developed, model selection is becoming increasingly important. Automating the selection and tuning of machine learning pipelines, consisting of data pre-processing methods and machine learning models, has long been one of the goals of the machine learning community. In this paper, we tackle this meta-learning task by combining ideas from collaborative filtering and Bayesian optimization. Using probabilistic matrix factorization techniques and acquisition functions from Bayesian optimization, we exploit experiments performed on hundreds of different datasets to guide the exploration of the space of possible pipelines. In our experiments, we show that our approach quickly identifies high-performing pipelines across a wide range of datasets, significantly outperforming the current state of the art.
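The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and the paper's actual model may differ. It assumes a plain linear matrix factorization fitted by alternating least squares, a Gaussian posterior over a new dataset's latent vector, and expected improvement as the acquisition function. All function names and hyperparameters (`rank`, `reg`, `noise_var`, `prior_var`) are hypothetical choices made for illustration.

```python
import math
import numpy as np


def fit_pipeline_factors(Y, rank=3, reg=0.1, iters=30, seed=0):
    """Fit a low-rank model Y ~ U @ V.T by alternating ridge regressions.

    Y is a (datasets x pipelines) performance matrix; NaN marks pipelines
    that were never run on a dataset. Returns the pipeline factors V.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    mask = ~np.isnan(Y)
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((m, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        for i in range(n):  # update each dataset's latent vector
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ Y[i, mask[i]])
        for j in range(m):  # update each pipeline's latent vector
            Uo = U[mask[:, j]]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ Y[mask[:, j], j])
    return V


def posterior_predict(V, observed_idx, observed_y, noise_var=0.01, prior_var=1.0):
    """Predict every pipeline's score on a new dataset with uncertainty.

    Treats V as fixed and applies Bayesian linear regression to the new
    dataset's latent vector, given the pipelines evaluated so far.
    """
    Vo = V[observed_idx]
    precision = Vo.T @ Vo / noise_var + np.eye(V.shape[1]) / prior_var
    cov = np.linalg.inv(precision)
    mean_u = cov @ Vo.T @ np.asarray(observed_y, dtype=float) / noise_var
    mu = V @ mean_u
    var = np.einsum("jk,kl,jl->j", V, cov, V) + noise_var
    return mu, np.sqrt(var)


def expected_improvement(mu, sigma, best):
    """Expected improvement over the best score seen so far (maximization)."""
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return (mu - best) * cdf + sigma * pdf


def suggest_next(V, observed_idx, observed_y):
    """Return the index of the untried pipeline with the highest EI.

    Requires at least one evaluated pipeline on the new dataset.
    """
    mu, sigma = posterior_predict(V, observed_idx, observed_y)
    ei = expected_improvement(mu, sigma, max(observed_y))
    ei[list(observed_idx)] = -np.inf  # never re-suggest an evaluated pipeline
    return int(np.argmax(ei))
```

In use, `fit_pipeline_factors` would run once on the historical results matrix. For each new dataset, the loop would then alternate between calling `suggest_next`, running the suggested pipeline, and appending its index and score to the observations. Holding the pipeline factors fixed keeps each suggestion cheap, at the cost of ignoring uncertainty in the factors themselves.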

Code Implementations: 1 repo
