DL SE May 6, 2020

Advancing computational reproducibility in the Dataverse data repository platform

Ana Trisovic, Philip Durbin, Tania Schlatter, Gustavo Durand, Sonia Barbosa, Danny Brooke, Mercè Crosas

arXiv:2005.02985v22.322 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the issue of irreproducible research for the scientific community, but it appears incremental as it builds on existing tools and repositories.

The paper tackles the problem of computational reproducibility in research by addressing the limitations of data repositories and reproducibility tools, aiming to improve the reproducibility of published and archived outputs.

Recent reproducibility case studies have raised concerns, showing that much of the deposited research is not reproducible. One of their conclusions was that the way data repositories store research data and code cannot fully facilitate reproducibility, because the runtime environment needed for code execution is absent. New specialized reproducibility tools provide cloud-based computational environments for code encapsulation, thus enabling research portability and reproducibility. However, they often do not enable research discoverability, standardized data citation, or long-term archival as data repositories do. This paper examines the shortcomings of data repositories and reproducibility tools and how they could be overcome to address the current lack of computational reproducibility in published and archived research outputs.
