SE · IR · Mar 20

yProv4DV: Reproducible Data Visualization Scripts Out of the Box

arXiv:2603.20437 · 24.4 · h-index: 2
AI Analysis

This work addresses a gap in reproducible research workflows by focusing on the script-based visualization practices that researchers and practitioners commonly use. The contribution is incremental, however, since it builds on existing reproducibility solutions.

The paper tackles the problem of sharing plots without the necessary components for independent reproduction by introducing yProv4DV, a lightweight library that enables reproducible data visualization scripts through provenance tracking with minimal code modifications.

While visualizing results is a critical phase in communicating new academic findings, plots are frequently shared without the complete combination of code, input data, execution context, and outputs required to independently reproduce the resulting figures. Existing reproducibility solutions tend to focus on computational pipelines or workflow management systems and do not cover the script-based visualization practices commonly used by researchers and practitioners. Additionally, the minimalist nature of current Python data visualization libraries tends to speed up the creation of images, which discourages users from spending time integrating additional tools into these short scripts. This paper proposes yProv4DV, a lightweight library designed to enable reproducible data visualization scripts through the use of provenance information while minimizing the need for code modifications. Through a single call, users can track inputs, outputs, and source code files, allowing their data visualization software to be saved and fully reproduced. As a result, the library fills a gap in reproducible research workflows by addressing the reproducibility of plots in scientific publications.
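To make the idea of single-call provenance tracking concrete, here is a minimal sketch using only the Python standard library. It is not the yProv4DV API: the function name `record_provenance`, its parameters, and the JSON layout are all hypothetical, chosen only to illustrate what tracking a script's source file, inputs, outputs, and execution context might involve.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_provenance(script, inputs, outputs, record_path="provenance.json"):
    """Hypothetical helper (an assumption, NOT the yProv4DV API).

    Writes one JSON record describing the plotting script, its input and
    output files, and the execution context, then returns the record.
    """
    def describe(paths):
        # Store each path together with a content hash so that later
        # changes to the file can be detected.
        return [{"path": str(p), "sha256": sha256_of(p)} for p in paths]

    record = {
        "created": datetime.now(timezone.utc).isoformat(),
        "context": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "script": describe([script])[0],
        "inputs": describe(inputs),
        "outputs": describe(outputs),
    }
    Path(record_path).write_text(json.dumps(record, indent=2))
    return record
```

In a plotting script, such a helper would be called once after the figure is saved, for example `record_provenance(__file__, ["data.csv"], ["figure.png"])`. Hashing file contents rather than only storing paths is a design choice of this sketch: it lets a reader check later whether the shared data and figure still match the recorded run.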
