SI CL DL · Feb 11, 2025

Hidden Division of Labor in Scientific Teams Revealed Through 1.6 Million LaTeX Files

arXiv:2502.07263v1 · 3.33 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses biased credit allocation in science and provides large-scale evidence to inform institutional policies. It is incremental, however, because it builds on existing concerns about authorship practices.

The study tackles the problem of obscured individual contributions in coauthored scientific papers. By analyzing 1.6 million LaTeX files, it reveals a hidden division of labor: some authors focus on conceptual sections, while others focus on technical sections. The contribution data are validated against self-reported statements (precision = 0.87) and Overleaf records (Spearman's rho = 0.6).

Recognition of individual contributions is fundamental to the scientific reward system, yet coauthored papers obscure who did what. Traditional proxies (author order and career stage) reinforce biases, while contribution statements remain self-reported and limited to select journals. We construct the first large-scale dataset on writing contributions by analyzing author-specific macros in LaTeX files from 1.6 million papers (1991-2023) by 2 million scientists. Validation against self-reported statements (precision = 0.87), author order patterns, field-specific norms, and Overleaf records (Spearman's rho = 0.6, p < 0.05) confirms the reliability of the created data. Using explicit section information, we reveal a hidden division of labor within scientific teams: some authors primarily contribute to conceptual sections (e.g., Introduction and Discussion), while others focus on technical sections (e.g., Methods and Experiments). These findings provide the first large-scale evidence of implicit labor division in scientific teams, challenging conventional authorship practices and informing institutional policies on credit allocation.
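The abstract does not say how author-specific macros are detected or mapped to sections. As a rough illustration only, the Python sketch below assumes such macros are one-argument `\newcommand` definitions that wrap an author's text, and that sections are marked with `\section{...}`. The macro names (`\alice`, `\bob`), the regular expressions, and the function `macro_uses_by_section` are hypothetical and are not the authors' pipeline.

```python
import re
from collections import Counter, defaultdict

# Assumption: an author-specific macro is defined as \newcommand{\name}[1]{...}.
MACRO_DEF = re.compile(r"\\newcommand\{\\([A-Za-z]+)\}\[1\]")
# Assumption: sections are introduced by \section{Title} or \section*{Title}.
SECTION = re.compile(r"\\section\*?\{([^}]*)\}")


def macro_uses_by_section(tex: str) -> dict[str, Counter]:
    """Count, per macro, how often it is used inside each section."""
    macros = set(MACRO_DEF.findall(tex))
    if not macros:
        return {}
    # Match a use such as \alice{...}; the trailing brace excludes longer names.
    use = re.compile(r"\\(" + "|".join(sorted(macros)) + r")\{")
    counts: dict[str, Counter] = defaultdict(Counter)
    # With one capture group, split() yields [preamble, title1, body1, title2, body2, ...].
    parts = SECTION.split(tex)
    for title, body in zip(parts[1::2], parts[2::2]):
        for match in use.finditer(body):
            counts[match.group(1)][title] += 1
    return dict(counts)


# Toy input, invented for this example.
sample = r"""
\newcommand{\alice}[1]{\textcolor{red}{#1}}
\newcommand{\bob}[1]{\textcolor{blue}{#1}}
\section{Introduction}
\alice{We motivate the problem.} \alice{We review related work.}
\section{Methods}
\bob{We fit a model.}
"""

result = macro_uses_by_section(sample)
for macro, sections in sorted(result.items()):
    print(macro, dict(sections))
```

In this toy input, one macro appears only in the Introduction and the other only in Methods, which is the kind of conceptual versus technical split the abstract describes. Linking a macro to a named author would need a further step that this sketch leaves out.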

