LG · CY · ME · Jul 19, 2023

Reproducibility in Machine Learning-Driven Research

arXiv:2307.10320v1 · 37 citations · h-index: 23
Originality: Synthesis-oriented
AI Analysis

The paper tackles low reproducibility in ML/AI research, a problem for researchers and practitioners alike, but its contribution is incremental: it surveys existing work without proposing new solutions.

This mini survey addresses the reproducibility crisis in machine learning-driven research by reviewing the literature to assess the current situation, identify issues and barriers, and identify potential drivers, such as tools and practices, that support reproducibility.

Research is facing a reproducibility crisis, in which the results and findings of many studies are difficult or even impossible to reproduce. This is also the case in machine learning (ML) and artificial intelligence (AI) research. Often, this is the case due to unpublished data and/or source-code, and due to sensitivity to ML training conditions. Although different solutions to address this issue are discussed in the research community such as using ML platforms, the level of reproducibility in ML-driven research is not increasing substantially. Therefore, in this mini survey, we review the literature on reproducibility in ML-driven research with three main aims: (i) reflect on the current situation of ML reproducibility in various research fields, (ii) identify reproducibility issues and barriers that exist in these research fields applying ML, and (iii) identify potential drivers such as tools, practices, and interventions that support ML reproducibility. With this, we hope to contribute to decisions on the viability of different solutions for supporting ML reproducibility.

