Dorothea Strecker

DL
4 papers
19 citations
Novelty 16%
AI Score 37

4 Papers

DL Oct 10, 2023
Disappearing repositories -- taking an infrastructure perspective on the long-term availability of research data

Dorothea Strecker, Heinz Pampel, Rouven Schabinger et al.

Currently, there is limited research investigating the phenomenon of research data repositories being shut down, and the impact this has on the long-term availability of data. This paper takes an infrastructure perspective on the preservation of research data by using a registry to identify 191 research data repositories that have been closed and presenting information on the shutdown process. The results show that 6.2 % of research data repositories indexed in the registry were shut down. The risks resulting in repository shutdown are varied. The median age of a repository when shutting down is 12 years. Strategies to prevent data loss at the infrastructure level are pursued to varying extents. 44 % of the repositories in the sample migrated data to another repository, and 12 % maintain limited access to their data collection. However, neither strategy is a permanent solution. Finally, the general lack of information on repository shutdown events as well as the effect on the findability of data and the permanence of the scholarly record are discussed.

48.3 DL Apr 30
Thinking like a business: Reconfiguring relationships to sustain open data infrastructures

Kathleen Gregory, Dorothea Strecker

Sustaining open data infrastructures over time is a complex puzzle, involving dynamic funding models and relationships with customers, collaborators, and competitors. Despite their importance, these mechanisms are often hidden from view, limiting their applicability to other infrastructures. In this article, we examine how Dryad, a well-known open data infrastructure, has worked toward financial sustainability by reconfiguring relationships with other actors and by strategically implementing a new business model and process of assetization. We identify four types of relationship reconfigurations with customers, collaborators, and competitors critical to Dryad's financial evolution: reinforcing, forging, positioning, and excluding. We then analyze how Dryad's strategic efforts to develop a new fee structure have changed its interpretations of value(s), community, and governance, factors important in an infrastructure's longevity. We conclude by highlighting emerging tensions that provide insight for other open infrastructures working to become financially sustainable. As a whole, our analysis focuses not just on financial mechanisms for funding open data infrastructures (although those emerge) but on the relationships which enable them.

DL Dec 13, 2025
How permanent are metadata for research data? Understanding changes in DataCite metadata

Dorothea Strecker

With the move towards open research information, the DOI registration agency DataCite is increasingly used as a source for metadata describing research data, for example to perform scientometric analyses. However, there is a lack of research on how DataCite metadata describing research data are created and maintained. This paper addresses this gap by using DataCite metadata provenance information to analyze the overall prevalence and patterns of change to DataCite metadata records. Metadata change was observed for 12.18 % of metadata records in the sample, and change tends to be incremental and not extensive. DataCite metadata records offer reliable descriptions of datasets and are stable enough to be used in scientometric research. The rate of change differs from previous studies of metadata change in other contexts, suggesting that there are differences in metadata practices between research data repositories and more traditional cataloging environments. The observed changes do not seem to fully align with idealized conceptualizations of metadata creation and maintenance for research data. In particular, the data do not show that metadata records are maintained routinely and continuously. Metadata change also has a limited effect on metadata completeness.

1.3 DL Mar 26
Improving metadata flows -- The simultaneous use of multiple metadata schemas at disciplinary research data repositories

Dorothea Strecker

This study investigates the simultaneous use of multiple metadata schemas at research data repositories. The analysis covers how eight disciplinary research data repositories from the geosciences and social sciences use disciplinary metadata schemas and the DataCite Metadata Schema, and how two metadata records describing the same dataset compare. The results show that DataCite metadata records could be improved considerably by optimizing schema crosswalks. However, the parallel use of disciplinary and multidisciplinary metadata records is complex. For example, discipline has a significant effect on the completeness of DataCite metadata. A temporal analysis also highlights that metadata workflows are diverse, and in some cases, suboptimal crosswalks are likely not the sole cause of incomplete DataCite metadata. Comparing the disciplinary metadata schemas and the DataCite Metadata Schema on a structural level reveals that most differences between schemas are the result of different approaches to modelling statements about datasets, not the lack of opportunity to express them. The element sets of both disciplinary metadata schemas and the DataCite Metadata Schema could be extended to describe datasets in more detail. These observations demonstrate that disciplinary and multidisciplinary metadata schemas serve distinct purposes. Disciplinary repositories should take full advantage of the opportunities both options provide.