The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding

Yifan Qian, Zhe Wen, Alexander C. Furnas, Yue Bai, Erzhuo Shao, Dashun Wang

arXiv:2601.15485v1, 1.21 citations

Originality: Synthesis-oriented

AI Analysis

This research addresses how LLMs influence federal funding allocation and scientific diversity, with implications for portfolio governance and long-term impact. Its contribution is incremental, since it rests on analysis of existing data.

The study examines how large language models (LLMs) are affecting US federal research funding by analyzing NSF and NIH proposals and awards. It finds that LLM use rose sharply beginning in 2023 and that higher LLM involvement is associated with lower semantic distinctiveness. The downstream associations are agency-dependent: at NIH, LLM use correlates with higher proposal success and greater publication output, although the gains are not in the most highly cited work, while no comparable associations are observed at NSF.

Federal research funding shapes the direction, diversity, and impact of the US scientific enterprise. Large language models (LLMs) are rapidly diffusing into scientific practice, holding substantial promise while raising widespread concerns. Despite growing attention to AI use in scientific writing and evaluation, little is known about how the rise of LLMs is reshaping the public funding landscape. Here, we examine LLM involvement at key stages of the federal funding pipeline by combining two complementary data sources: confidential National Science Foundation (NSF) and National Institutes of Health (NIH) proposal submissions from two large US R1 universities, including funded, unfunded, and pending proposals, and the full population of publicly released NSF and NIH awards. We find that LLM use rises sharply beginning in 2023 and exhibits a bimodal distribution, indicating a clear split between minimal and substantive use. Across both private submissions and public awards, higher LLM involvement is consistently associated with lower semantic distinctiveness, positioning projects closer to recently funded work within the same agency. The consequences of this shift are agency-dependent. LLM use is positively associated with proposal success and higher subsequent publication output at NIH, whereas no comparable associations are observed at NSF. Notably, the productivity gains at NIH are concentrated in non-hit papers rather than the most highly cited work. Together, these findings provide large-scale evidence that the rise of LLMs is reshaping how scientific ideas are positioned, selected, and translated into publicly funded research, with implications for portfolio governance, research diversity, and the long-run impact of science.
