DL AI LG May 7, 2024

Can citations tell us about a paper's reproducibility? A case study of machine learning papers

Rochana R. Obadage, Sarah M. Rajtmajer, Jian Wu

arXiv:2405.03977v12.36 citations h-index: 7 Has Code ACM-REP

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of evaluating reproducibility for researchers and practitioners in ML/AI. Its advance is incremental because it builds on existing sentiment analysis methods.

The study assesses the reproducibility of machine learning papers by analyzing citation contexts. It finds that sentiment in citations correlates with reproducibility scores, and its classifiers reach up to 85% accuracy in identifying reproducibility-related contexts.

The iterative character of work in machine learning (ML) and artificial intelligence (AI) and reliance on comparisons against benchmark datasets emphasize the importance of reproducibility in that literature. Yet, resource constraints and inadequate documentation can make running replications particularly challenging. Our work explores the potential of using downstream citation contexts as a signal of reproducibility. We introduce a sentiment analysis framework applied to citation contexts from papers involved in Machine Learning Reproducibility Challenges in order to interpret the positive or negative outcomes of reproduction attempts. Our contributions include training classifiers for reproducibility-related contexts and sentiment analysis, and exploring correlations between citation context sentiment and reproducibility scores. Study data, software, and an artifact appendix are publicly available at https://github.com/lamps-lab/ccair-ai-reproducibility .
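The correlation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label-to-number mapping, the function names, and the toy data are all assumptions made for illustration, and Spearman rank correlation is used here only as one plausible choice of statistic. The sketch averages citation-context sentiment per cited paper and correlates that average with a per-paper reproducibility score.

```python
from collections import defaultdict
from math import sqrt

# Assumed numeric encoding of sentiment labels (hypothetical, not from the paper).
SENTIMENT_VALUE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def mean_sentiment_per_paper(contexts):
    """Average the sentiment of all citation contexts that cite each paper.

    contexts: iterable of (paper_id, sentiment_label) pairs.
    """
    values = defaultdict(list)
    for paper_id, label in contexts:
        values[paper_id].append(SENTIMENT_VALUE[label])
    return {pid: sum(v) / len(v) for pid, v in values.items()}


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    contexts = [
        ("paper_a", "positive"), ("paper_a", "positive"),
        ("paper_b", "neutral"), ("paper_b", "negative"),
        ("paper_c", "positive"), ("paper_c", "neutral"),
    ]
    reproducibility_score = {"paper_a": 0.9, "paper_b": 0.2, "paper_c": 0.6}

    sentiment = mean_sentiment_per_paper(contexts)
    papers = sorted(sentiment)
    rho = spearman(
        [sentiment[p] for p in papers],
        [reproducibility_score[p] for p in papers],
    )
    print(f"Spearman correlation: {rho:.3f}")
```

In practice the sentiment labels would come from a trained classifier rather than hand-assigned values, and a library routine such as `scipy.stats.spearmanr` could replace the hand-written rank correlation.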
