LG | Aug 24, 2022

A Survey of Open Source Automation Tools for Data Science Predictions

arXiv:2208.11792v1 | h-index: 3 | Has Code
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey paper for data scientists and practitioners, summarizing current automation tools and highlighting areas needing further development.

The paper surveys technical and cultural challenges in automating the data science prediction lifecycle for supervised learning with structured datasets, and reviews existing open source Python tools that address these challenges while identifying remaining gaps.

We present an expository overview of technical and cultural challenges to the development and adoption of automation at various stages in the data science prediction lifecycle, restricting focus to supervised learning with structured datasets. In addition, we review popular open source Python tools implementing common solution patterns for the automation challenges and highlight gaps where we feel progress still needs to be made.

