AI | LG | Oct 22, 2019

How can AI Automate End-to-End Data Science?

arXiv:1910.14436v1 | 16 citations
Originality: Synthesis-oriented
AI Analysis

This survey aims to democratize data science by addressing its labor-intensive nature and its reliance on scarce human experts. Its contribution is incremental, since it builds on existing approaches.

The paper aims to make data science more accessible and scalable. It introduces and defines the Automated Data Science (AutoDS) challenge, proposes a general framework, and reviews existing literature to provide guidelines for future research.

Data science is labor-intensive, and human experts are scarce but heavily involved in every aspect of it. This makes data science time-consuming and restricted to experts, with the resulting quality heavily dependent on their experience and skills. To make data science more accessible and scalable, we need its democratization. Automated Data Science (AutoDS) is aimed at that goal and is emerging as an important research and business topic. We introduce and define the AutoDS challenge, then propose a general AutoDS framework that covers existing approaches and also provides guidance for the development of new methods. We categorize and review the existing literature from multiple aspects of the problem setup and the techniques employed. We then provide several views on how AI could succeed in automating end-to-end data science. We hope this survey can serve as an insightful guideline for the AutoDS field and provide inspiration for future research.
