Rupert A. C. Croft

h-index: 50

Papers: 5

Citations: 110

Novelty: 31%

AI Score: 34

Ranked #112,049 of 194,257 authors (top 58%) | #60 in CO (top 50%)

5 Papers

4.3 | CO | Apr 4, 2023 | Code

The CAMELS project: Expanding the galaxy formation model space with new ASTRID and 28-parameter TNG and SIMBA suites

Yueying Ni, Shy Genel, Daniel Anglés-Alcázar et al.

We present CAMELS-ASTRID, the third suite of hydrodynamical simulations in the Cosmology and Astrophysics with MachinE Learning (CAMELS) project, along with new simulation sets that extend the model parameter space based on the previous frameworks of CAMELS-TNG and CAMELS-SIMBA, to provide broader training sets and testing grounds for machine-learning algorithms designed for cosmological studies. CAMELS-ASTRID employs the galaxy formation model following the ASTRID simulation and contains 2,124 hydrodynamic simulation runs that vary 3 cosmological parameters ($Ω_m$, $σ_8$, $Ω_b$) and 4 parameters controlling stellar and AGN feedback. Compared to the existing TNG and SIMBA simulation suites in CAMELS, the fiducial model of ASTRID features the mildest AGN feedback and predicts the least baryonic effect on the matter power spectrum. The training set of ASTRID covers a broader variation in the galaxy populations and the baryonic impact on the matter power spectrum compared to its TNG and SIMBA counterparts, which can make machine-learning models trained on the ASTRID suite exhibit better extrapolation performance when tested on other hydrodynamic simulation sets. We also introduce extension simulation sets in CAMELS that widely explore 28 parameters in the TNG and SIMBA models, demonstrating the enormity of the overall galaxy formation model parameter space and the complex non-linear interplay between cosmology and astrophysical processes. With the new simulation suites, we show that building robust machine-learning models favors training and testing on the largest possible diversity of galaxy formation models. We also demonstrate that it is possible to train accurate neural networks to infer cosmological parameters using the high-dimensional TNG-SB28 simulation set.

3.3 | IM | Jul 11, 2025 | Code

Bridging Literature and the Universe Via A Multi-Agent Large Language Model System

Xiaowen Zhang, Zhenyu Bi, Patrick Lachance et al.

As cosmological simulations and their associated software become increasingly complex, physicists face the challenge of searching through vast amounts of literature and user manuals to extract simulation parameters from dense academic papers, each using different models and formats. Translating these parameters into executable scripts remains a time-consuming and error-prone process. To improve efficiency in physics research and accelerate the cosmological simulation process, we introduce SimAgents, a multi-agent system designed to automate both parameter configuration from the literature and preliminary analysis for cosmology research. SimAgents is powered by specialized LLM agents capable of physics reasoning, simulation software validation, and tool execution. These agents collaborate through structured communication, ensuring that extracted parameters are physically meaningful, internally consistent, and software-compliant. We also construct a cosmological parameter extraction evaluation dataset by collecting over 40 simulations in published papers from arXiv and leading journals that cover diverse simulation types. Experiments on the dataset demonstrate the strong performance of SimAgents, highlighting its effectiveness and potential to accelerate scientific research for physicists. Our demonstration video is available at: https://youtu.be/w1zLpm_CaWA. The complete system and dataset are publicly available at https://github.com/xwzhang98/SimAgents.

6.6 | CO | May 3, 2021

AI-assisted super-resolution cosmological simulations II: Halo substructures, velocities and higher order statistics

Yueying Ni, Yin Li, Patrick Lachance et al.

In this work, we expand and test the capabilities of our recently developed super-resolution (SR) model to generate high-resolution (HR) realizations of the full phase-space matter distribution, including both displacement and velocity, from computationally cheap low-resolution (LR) cosmological N-body simulations. The SR model enhances the simulation resolution by generating 512 times more tracer particles, extending into the deeply non-linear regime where complex structure formation processes take place. We validate the SR model by deploying it in 10 test simulations of box size 100 Mpc/h, and examine the matter power spectra, bispectra and 2D power spectra in redshift space. We find the generated SR field matches the true HR result at percent level down to scales of k ~ 10 h/Mpc. We also identify and inspect dark matter halos and their substructures. Our SR model generates visually authentic small-scale structures that cannot be resolved by the LR input and that are in good statistical agreement with the real HR results. The SR model performs satisfactorily on the halo occupation distribution, halo correlations in both real and redshift space, and the pairwise velocity distribution, matching the HR results with comparable scatter, thus demonstrating its potential in making mock halo catalogs. The SR technique can be a powerful and promising tool for modelling small-scale galaxy formation physics in large cosmological volumes.

11.3 | CO | Oct 13, 2020

AI-assisted super-resolution cosmological simulations

Yin Li, Yueying Ni, Rupert A. C. Croft et al.

Cosmological simulations of galaxy formation are limited by finite computational resources. We draw from the ongoing rapid advances in Artificial Intelligence (specifically Deep Learning) to address this problem. Neural networks have been developed to learn from high-resolution (HR) image data, and then make accurate super-resolution (SR) versions of different low-resolution (LR) images. We apply such techniques to LR cosmological N-body simulations, generating SR versions. Specifically, we are able to enhance the simulation resolution by generating 512 times more particles and predicting their displacements from the initial positions. Therefore our results can be viewed as new simulation realizations themselves rather than projections, e.g., to their density fields. Furthermore, the generation process is stochastic, enabling us to sample the small-scale modes conditioning on the large-scale environment. Our model learns from only 16 pairs of small-volume LR-HR simulations, and is then able to generate SR simulations that successfully reproduce the HR matter power spectrum to percent level up to $16\,h^{-1}\mathrm{Mpc}$, and the HR halo mass function to within $10 \%$ down to $10^{11} \, M_\odot$. We successfully deploy the model in a box 1000 times larger than the training simulation box, showing that high-resolution mock surveys can be generated rapidly. We conclude that AI assistance has the potential to revolutionize modeling of small-scale galaxy formation physics in large cosmological volumes.
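The two scaling figures in this abstract can be unpacked with simple arithmetic. The sketch below assumes a three-dimensional simulation box and reads "1000 times larger" as a ratio of volumes; the helper name `linear_factor` is hypothetical and not taken from the paper or its code.

```python
def linear_factor(count_factor: int, dim: int = 3) -> int:
    """Return the per-dimension factor whose dim-th power equals count_factor.

    Raises ValueError when count_factor is not a perfect dim-th power.
    """
    root = round(count_factor ** (1.0 / dim))
    if root ** dim != count_factor:
        raise ValueError(f"{count_factor} is not a perfect power of degree {dim}")
    return root


# 512 times more particles in 3D corresponds to 8 times more particles
# along each axis, since 8**3 == 512.
particles_per_axis = linear_factor(512)

# A box with 1000 times the volume (assumed reading) has sides 10 times
# longer, since 10**3 == 1000.
box_side_ratio = linear_factor(1000)

print(particles_per_axis, box_side_ratio)
```

Under these assumptions, the SR model refines each axis by a factor of 8 and is deployed in a box whose side is 10 times that of the training box.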

2.3 | IM | Jan 31, 2019

Towards Machine-assisted Meta-Studies: The Hubble Constant

Tom Crossland, Pontus Stenetorp, Sebastian Riedel et al.

We present an approach for automatic extraction of measured values from the astrophysical literature, using the Hubble constant for our pilot study. Our rules-based model -- a classical technique in natural language processing -- has successfully extracted 298 measurements of the Hubble constant, with uncertainties, from the 208,541 available arXiv astrophysics papers. We have also created an artificial neural network classifier to identify papers in arXiv which report novel measurements. From the analysis of our results we find that reporting measurements with uncertainties and the correct units is critical information when distinguishing novel measurements in free text. Our results correctly highlight the current tension for measurements of the Hubble constant and recover the $3.5σ$ discrepancy -- demonstrating that the tool presented in this paper is useful for meta-studies of astrophysical measurements from a large number of publications.
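As a rough illustration of what a rules-based extractor for such measurements might look like, the sketch below uses a single regular expression to pull a Hubble constant value and its symmetric uncertainty from free text. It is not the authors' model: the pattern, the function name `extract_h0`, and the sample sentence are all assumptions made for this example, and a real system would need many more rules (asymmetric errors, LaTeX markup, other unit spellings).

```python
import re

# Hypothetical pattern: "H0" or "H_0", a value, a plus-minus sign, an
# uncertainty, and units written as "km/s/Mpc" or "km s^-1 Mpc^-1".
NUMBER = r"(\d+(?:\.\d+)?)"
UNITS = r"km\s*(?:/\s*s|s\^?-1)\s*(?:/\s*Mpc|Mpc\^?-1)"
H0_PATTERN = re.compile(
    r"H_?0\s*=\s*" + NUMBER + r"\s*(?:±|\+/-|\\pm)\s*" + NUMBER + r"\s*" + UNITS
)


def extract_h0(text: str) -> list[tuple[float, float]]:
    """Return (value, uncertainty) pairs for each H0 measurement found in text."""
    return [(float(v), float(e)) for v, e in H0_PATTERN.findall(text)]


# Illustrative input only; this sentence is made up for the example and
# is not quoted from any paper.
sample = (
    "We find H0 = 73.2 ± 1.3 km/s/Mpc, while another probe gives "
    "H_0 = 67.4 +/- 0.5 km s^-1 Mpc^-1."
)
print(extract_h0(sample))
```

Requiring both the uncertainty and the units in the pattern mirrors the abstract's observation that these are the cues that separate a reported measurement from a passing mention of a number.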