DLCLOct 12, 2024

SciEvo: A 2 Million, 30-Year Cross-disciplinary Dataset for Temporal Scientometric Analysis

arXiv:2410.09510v2h-index: 11Has Code
Originality Synthesis-oriented
AI Analysis

This provides a resource for researchers in scientometrics to analyze cross-disciplinary trends, though it is incremental as it builds on existing datasets.

The authors tackled the problem of understanding scientific knowledge evolution by introducing SciEvo, a dataset with over 2 million publications spanning 30 years, and used it to reveal disparities such as shorter citation ages in fields like LLMs (2.48 years) compared to traditional disciplines (9.71 years).

Understanding the creation, evolution, and dissemination of scientific knowledge is crucial for bridging diverse subject areas and addressing complex global challenges such as pandemics, climate change, and ethical AI. Scientometrics, the quantitative and qualitative study of scientific literature, provides valuable insights into these processes. We introduce SciEvo, a longitudinal scientometric dataset with over two million academic publications, providing comprehensive contents information and citation graphs to support cross-disciplinary analyses. SciEvo is easy to use and available across platforms, including GitHub, Kaggle, and HuggingFace. Using SciEvo, we conduct a temporal study spanning over 30 years to explore key questions in scientometrics: the evolution of academic terminology, citation patterns, and interdisciplinary knowledge exchange. Our findings reveal critical insights, such as disparities in epistemic cultures, knowledge production modes, and citation practices. For example, rapidly developing, application-driven fields like LLMs exhibit significantly shorter citation age (2.48 years) compared to traditional theoretical disciplines like oral history (9.71 years). Our data and analytic tools can be accessed at https://github.com/Ahren09/SciEvo.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes