Bridging Literature and the Universe Via A Multi-Agent Large Language Model System
This work addresses a domain-specific challenge for physicists by automating parameter extraction and preliminary analysis in cosmology research. The contribution is incremental, since it applies existing multi-agent and LLM methods to a new application area.
Extracting simulation parameters from dense cosmological literature and translating them into executable scripts is time-consuming and error-prone. The authors introduce SimAgents, a multi-agent LLM system that automates this process, and report strong performance on an evaluation dataset of over 40 simulations.
As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations from papers published on arXiv and in leading journals, covering diverse simulation types. Experiments on the dataset demonstrate strong performance of SimAgents, highlighting its effectiveness and potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.
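To make the idea of agents checking extracted parameters through structured communication more concrete, the following is a minimal Python sketch. It is not the SimAgents implementation: the `Review` message type, the function names, the parameter keys, and the required-key schema are all hypothetical, and the LLM agents are replaced here by simple rule-based stand-ins. The physics check additionally assumes a flat universe with negligible radiation density.

```python
from dataclasses import dataclass, field

# Hypothetical schema: keys that an imagined simulation code requires.
REQUIRED_KEYS = {"omega_m", "omega_lambda", "h", "box_size_mpc_h", "n_particles"}


@dataclass
class Review:
    """Structured message that one reviewing agent hands to the coordinator."""
    agent: str
    issues: list = field(default_factory=list)

    @property
    def ok(self):
        return not self.issues


def physics_review(params):
    """Stand-in for a physics agent: range and internal-consistency checks."""
    review = Review("physics")
    om, ol = params.get("omega_m"), params.get("omega_lambda")
    if om is not None and not 0.0 < om < 1.0:
        review.issues.append("omega_m outside (0, 1)")
    # Assumption: flat universe, so the two densities should sum to about 1.
    if om is not None and ol is not None and abs(om + ol - 1.0) > 1e-2:
        review.issues.append("omega_m + omega_lambda != 1 (flatness assumed)")
    return review


def software_review(params):
    """Stand-in for a software agent: checks the hypothetical required keys."""
    review = Review("software")
    for key in sorted(REQUIRED_KEYS - set(params)):
        review.issues.append(f"missing required key: {key}")
    return review


def render_config(params):
    """Render parameters as 'key = value' lines (an invented file format)."""
    return "\n".join(f"{k} = {params[k]}" for k in sorted(params))


def configure(params):
    """Emit a config script only if every reviewer reports no issues."""
    reviews = [physics_review(params), software_review(params)]
    script = render_config(params) if all(r.ok for r in reviews) else None
    return script, reviews
```

Calling `configure` on a complete, consistent parameter set returns a config string together with two empty reviews. If any check fails, it returns `None` along with the issues each reviewer raised, which an extraction agent could then use to revise its output.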