DL · CL · Apr 30

Measuring research data reuse in scholarly publications using generative artificial intelligence: Open Science Indicator development and preliminary results

arXiv:2604.28061
AI Analysis

The indicator gives metascience researchers a scalable method to quantify the downstream impact of open science practices.

The authors developed an LLM-based indicator to measure research data reuse in scholarly publications. They found a 43% reuse rate, higher than the rates obtained with established bibliometric methods, which suggests that current estimates of data reuse may be too low.

Numerous metascience studies and other initiatives have begun to monitor the prevalence of open science practices, although it is more important to understand the 'downstream' effects or impacts of open science. PLOS and DataSeer have developed a new LLM-based indicator to measure an important effect of open science: the reuse of research data. Our results show a data reuse rate of 43%, which is higher than the rates found with established bibliometric techniques. We show that data reuse can be measured at scale using LLMs and generative artificial intelligence. The positive effects of research data sharing and reuse may currently be underestimated.
