SELGJun 12, 2020

dagger: A Python Framework for Reproducible Machine Learning Experiment Orchestration

arXiv:2006.07484v18 citations
AI Analysis

This addresses reproducibility issues for researchers conducting multi-stage experiments, such as in neural network pruning, but it is incremental as it builds on existing ML frameworks.

The authors tackled the problem of managing complex, multi-stage machine learning experiments by introducing dagger, a Python framework for reproducible experiment orchestration, which helps track experimental provenance and state trees.

Many research directions in machine learning, particularly in deep learning, involve complex, multi-stage experiments, commonly involving state-mutating operations acting on models along multiple paths of execution. Although machine learning frameworks provide clean interfaces for defining model architectures and unbranched flows, burden is often placed on the researcher to track experimental provenance, that is, the state tree that leads to a final model configuration and result in a multi-stage experiment. Originally motivated by analysis reproducibility in the context of neural network pruning research, where multi-stage experiment pipelines are common, we present dagger, a framework to facilitate reproducible and reusable experiment orchestration. We describe the design principles of the framework and example usage.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes