10.1 DL Apr 26
The software space of science
Zhouming Wu, Dakota Murray
Science advances not only through the accumulation of facts but also through the evolution of tools. Crucially, tools are rarely used in isolation. They form tool portfolios: combinations shaped by a discipline's workflows and analytical demands. Software, near-ubiquitous in modern research and traceable across the published literature, offers a unique window onto tool use in science. Here, we map the software space of science by analyzing software mentions in 1.3 million publications from 2004 to 2021. We construct a network of 520 software tools linked by disciplinary co-usage, with link strength weighted by proximity based on revealed comparative advantage. This network reveals a structured landscape in which tools cluster into 8 functional communities, including computing and statistics, wet lab instrumentation, and several bioinformatics specializations, with each discipline occupying a distinct position that reflects its characteristic tool portfolio. The breadth of a discipline's tool portfolio is shaped by the nature of its research workflow: fields combining experimental and computational tasks draw on multiple communities, while those with narrower methodological demands concentrate in one. These structural differences are stable across the observation period. At the same time, across all broad disciplinary categories, disciplinary tool portfolios are crystallizing, settling on a common set of tools.
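The proximity weighting mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard revealed comparative advantage (RCA) ratio and a product-space-style proximity (the share of disciplines with RCA >= 1 in both tools, divided by the larger of the two tools' counts); the abstract does not spell out the exact formula, so treat both as assumptions. The function names and the toy counts are hypothetical and are not data from the paper.

```python
from itertools import combinations

def rca_matrix(counts):
    """RCA of each tool in each discipline.

    counts: dict mapping discipline -> dict mapping tool -> mention count.
    RCA = (tool's share within the discipline) / (tool's share overall).
    """
    total = sum(sum(row.values()) for row in counts.values())
    tool_totals = {}
    for row in counts.values():
        for tool, n in row.items():
            tool_totals[tool] = tool_totals.get(tool, 0) + n
    rca = {}
    for disc, row in counts.items():
        disc_total = sum(row.values())
        rca[disc] = {
            tool: (row.get(tool, 0) / disc_total) / (tool_totals[tool] / total)
            for tool in tool_totals
        }
    return rca

def proximity(rca, threshold=1.0):
    """Assumed proximity: co-specialized disciplines / max specialized count."""
    tools = sorted(next(iter(rca.values())))
    users = {t: {d for d in rca if rca[d][t] >= threshold} for t in tools}
    prox = {}
    for a, b in combinations(tools, 2):
        denom = max(len(users[a]), len(users[b]))
        prox[(a, b)] = len(users[a] & users[b]) / denom if denom else 0.0
    return prox

# Toy mention counts, invented purely for illustration.
toy_counts = {
    "biology":    {"R": 30, "ImageJ": 50, "BLAST": 20},
    "statistics": {"R": 90, "ImageJ": 5,  "BLAST": 5},
    "medicine":   {"R": 40, "ImageJ": 40, "BLAST": 20},
}
toy_rca = rca_matrix(toy_counts)
toy_prox = proximity(toy_rca)
```

In this toy example the two tools that the same disciplines specialize in end up strongly linked, while a tool favored by a different discipline is unlinked from them. A real analysis would also need to handle disciplines with zero mentions and sparse counts.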
DL Jul 30, 2021
Investigating Disagreement in the Scientific Literature
Wout S. Lamers, Kevin Boyack, Vincent Larivière et al.
Disagreement is essential to scientific progress. However, the extent of disagreement in science, its evolution over time, and the fields in which it happens remain poorly understood. Leveraging a massive collection of English-language scientific texts, we develop a cue-phrase-based approach to identify instances of disagreement citations across more than four million scientific articles. Using this method, we construct an indicator of disagreement across scientific fields over the 2000-2015 period. In contrast with black-box text classification methods, our framework is transparent and easily interpretable. We reveal a disciplinary spectrum of disagreement, with higher disagreement in the social sciences and lower disagreement in physics and mathematics. However, detailed disciplinary analysis demonstrates heterogeneity across sub-fields, revealing the importance of local disciplinary cultures and epistemic characteristics of disagreement. Paper-level analysis reveals notable episodes of disagreement in science, and illustrates how methodological artifacts can confound analyses of scientific texts. These findings contribute to a broader understanding of disagreement and establish a foundation for future research on the key processes underlying scientific progress.
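A cue-phrase approach of the kind described above can be sketched in a few lines. This is only an illustration: the cue patterns, the citation-marker regex, and the function names are hypothetical stand-ins, not the authors' actual cue list or pipeline.

```python
import re

# Hypothetical cue phrases; the paper's real list is not given in the abstract.
CUE_PATTERNS = [
    r"\bdisagree\w*",
    r"\bcontradict\w*",
    r"\bcontrovers\w*",
    r"\bin contrast (?:to|with)\b",
]
CUE_RE = re.compile("|".join(CUE_PATTERNS), re.IGNORECASE)

# Assumed citation markers: numeric "[12]" or author-year "(Smith 2010)".
CITATION_RE = re.compile(r"\[\d+(?:,\s*\d+)*\]|\([A-Z][^()]*\d{4}\)")

def is_disagreement_citation(sentence):
    """True if the sentence has both a citation marker and a cue phrase."""
    return bool(CUE_RE.search(sentence)) and bool(CITATION_RE.search(sentence))

def disagreement_share(sentences):
    """Share of citing sentences that are flagged as disagreement."""
    cited = [s for s in sentences if CITATION_RE.search(s)]
    if not cited:
        return 0.0
    return sum(is_disagreement_citation(s) for s in cited) / len(cited)

# Invented example sentences.
toy_sentences = [
    "Our results contradict earlier estimates [12].",
    "We follow the protocol described previously (Smith 2010).",
    "This finding is controversial.",
]
toy_share = disagreement_share(toy_sentences)
```

Because every flagged sentence can be traced to the phrase that triggered it, a rule set like this stays inspectable, which is the transparency advantage the abstract contrasts with black-box classifiers. It also shows how artifacts arise: a cue word used in a non-disagreement sense would still be counted.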
LG Dec 4, 2020
Unsupervised embedding of trajectories captures the latent structure of scientific migration
Dakota Murray, Jisung Yoon, Sadamori Kojaku et al.
Human migration and mobility drive major societal phenomena including epidemics, economies, innovation, and the diffusion of ideas. Although human mobility and migration have been heavily constrained by geographic distance throughout history, advances and globalization are making other factors such as language and culture increasingly important. Advances in neural embedding models, originally designed for natural language, provide an opportunity to tame this complexity and open new avenues for the study of migration. Here, we demonstrate the ability of the word2vec model to encode nuanced relationships between discrete locations from migration trajectories, producing an accurate, dense, continuous, and meaningful vector-space representation. The resulting representation provides a functional distance between locations, as well as a digital double that can be distributed, re-used, and itself interrogated to understand the many dimensions of migration. We show that the unique power of word2vec to encode migration patterns stems from its mathematical equivalence with the gravity model of mobility. Focusing on the case of scientific migration, we apply word2vec to a database of three million migration trajectories of scientists derived from the affiliations listed on their publication records. Using techniques that leverage the embedding's semantic structure, we demonstrate that embeddings can learn the rich structure that underpins scientific migration, such as cultural, linguistic, and prestige relationships at multiple levels of granularity. Our results provide a theoretical foundation and methodological framework for using neural embeddings to represent and understand migration both within and beyond science.
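For readers unfamiliar with the gravity model referenced above, its commonly used general form is shown below. The symbols are conventional choices rather than the paper's notation, and reading the embedding-based functional distance as the distance term is an assumption based on the abstract, not a statement of the authors' derivation.

```latex
% General form of the gravity model of mobility (conventional notation):
%   T_{ij} : expected flow of people between locations i and j
%   m_i    : "mass" of location i (e.g., its population of scientists)
%   d_{ij} : distance between i and j (geographic, or a functional distance)
%   f      : increasing deterrence function, e.g. f(d) = d^{\alpha} or e^{d/\ell}
%   C      : normalization constant
T_{ij} = C \, \frac{m_i \, m_j}{f(d_{ij})}
```

Under this reading, two locations that lie close together in the embedding space would be predicted to exchange more migrants than their sizes alone would suggest.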