DB, LG, SE · Oct 21, 2022

Management of Machine Learning Lifecycle Artifacts: A Survey

arXiv:2210.11831v1 · 61 citations · h-index: 32
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey that helps researchers and practitioners compare and select tools for ML artifact management.

The paper surveys over 60 systems and platforms for managing machine learning lifecycle artifacts, such as datasets and models, to address challenges in comparability and reproducibility, deriving assessment criteria from a systematic literature review.

The explorative and iterative nature of developing and operating machine learning (ML) applications leads to a variety of artifacts, such as datasets, features, models, hyperparameters, metrics, software, configurations, and logs. To enable comparability, reproducibility, and traceability of these artifacts across the ML lifecycle steps and iterations, systems and tools have been developed to support their collection, storage, and management. It is often not obvious what precise functional scope such systems offer, which makes comparing candidates and estimating synergy effects between them quite challenging. In this paper, we aim to give an overview of systems and platforms which support the management of ML lifecycle artifacts. Based on a systematic literature review, we derive assessment criteria and apply them to a representative selection of more than 60 systems and platforms.

