CO May 16, 2022
Fast and realistic large-scale structure from machine-learning-augmented random field simulations
Davide Piras, Benjamin Joachimi, Francisco Villaescusa-Navarro
Producing thousands of simulations of the dark matter distribution in the Universe with increasing precision is a challenging but critical task to facilitate the exploitation of current and forthcoming cosmological surveys. Many inexpensive substitutes to full $N$-body simulations have been proposed, even though they often fail to reproduce the statistics of the smaller, non-linear scales. Among these alternatives, a common approximation is represented by the lognormal distribution, which comes with its own limitations as well, while being extremely fast to compute even for high-resolution density fields. In this work, we train a generative deep learning model, mainly made of convolutional layers, to transform projected lognormal dark matter density fields to more realistic dark matter maps, as obtained from full $N$-body simulations. We detail the procedure that we follow to generate highly correlated pairs of lognormal and simulated maps, which we use as our training data, exploiting the information of the Fourier phases. We demonstrate the performance of our model comparing various statistical tests with different field resolutions, redshifts and cosmological parameters, proving its robustness and explaining its current limitations. When evaluated on 100 test maps, the augmented lognormal random fields reproduce the power spectrum up to wavenumbers of $1 \ h \ \rm{Mpc}^{-1}$, and the bispectrum within 10%, and always within the error bars, of the fiducial target simulations. Finally, we describe how we plan to integrate our proposed model with existing tools to yield more accurate spherical random fields for weak lensing analysis.
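The lognormal approximation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `lognormal_overdensity` is hypothetical, and the input here is uncorrelated white noise, whereas a realistic setup would presumably start from a Gaussian field drawn with a target power spectrum. The sketch only shows the standard pointwise transform from a zero-mean Gaussian field to an overdensity field that is bounded below by -1 and has zero expected mean.

```python
import numpy as np


def lognormal_overdensity(gaussian_field):
    """Map a Gaussian field g to a lognormal overdensity.

    Uses delta = exp(g - var/2) - 1, so that delta > -1 everywhere and
    the expected mean of delta is zero. (Hypothetical helper for
    illustration; not taken from the paper.)
    """
    g = np.asarray(gaussian_field, dtype=float)
    g = g - g.mean()          # enforce a zero-mean Gaussian field
    variance = g.var()        # sample variance of the Gaussian field
    return np.exp(g - variance / 2.0) - 1.0


# Assumption: white-noise Gaussian field on a 128 x 128 grid, for illustration only.
rng = np.random.default_rng(0)
g = rng.normal(0.0, 0.5, size=(128, 128))
delta = lognormal_overdensity(g)
```

Because the transform is applied pixel by pixel, it is cheap even at high resolution, which is consistent with the speed advantage the abstract attributes to lognormal fields.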
CO May 27, 2025
Transfer learning for multifidelity simulation-based inference in cosmology
Alex A. Saoulis, Davide Piras, Niall Jeffrey et al.
Simulation-based inference (SBI) enables cosmological parameter estimation when closed-form likelihoods or models are unavailable. However, SBI relies on machine learning for neural compression and density estimation. This requires large training datasets which are prohibitively expensive for high-quality simulations. We overcome this limitation with multifidelity transfer learning, combining less expensive, lower-fidelity simulations with a limited number of high-fidelity simulations. We demonstrate our methodology on dark matter density maps from two separate simulation suites in the hydrodynamical CAMELS Multifield Dataset. Pre-training on dark-matter-only $N$-body simulations reduces the required number of high-fidelity hydrodynamical simulations by a factor between $8$ and $15$, depending on the model complexity, posterior dimensionality, and performance metrics used. By leveraging cheaper simulations, our approach enables performant and accurate inference on high-fidelity models while substantially reducing computational costs.
LG Oct 11, 2024
Simulation-based inference with scattering representations: scattering is all you need
Kiyam Lin, Benjamin Joachimi, Jason D. McEwen
We demonstrate the successful use of scattering representations without further compression for simulation-based inference (SBI) with images (i.e. field-level), illustrated with a cosmological case study. Scattering representations provide a highly effective representational space for subsequent learning tasks, although the higher dimensional compressed space introduces challenges. We overcome these through spatial averaging, coupled with more expressive density estimators. Compared to alternative methods, such an approach does not require additional simulations for either training or computing derivatives, is interpretable, and resilient to covariate shift. As expected, we show that a scattering only approach extracts more information than traditional second order summary statistics.
GEO-PH Jan 12, 2021
Towards fast machine-learning-assisted Bayesian posterior inference of microseismic event location and source mechanism
Davide Piras, Alessio Spurio Mancini, Ana M. G. Ferreira et al.
Bayesian inference applied to microseismic activity monitoring allows the accurate location of microseismic events from recorded seismograms and the estimation of the associated uncertainties. However, the forward modelling of these microseismic events, which is necessary to perform Bayesian source inversion, can be prohibitively expensive in terms of computational resources. A viable solution is to train a surrogate model based on machine learning techniques, to emulate the forward model and thus accelerate Bayesian inference. In this paper, we substantially enhance previous work, which considered only sources with isotropic moment tensors. We train a machine learning algorithm on the power spectrum of the recorded pressure wave and show that the trained emulator allows complete and fast event locations for $\textit{any}$ source mechanism. Moreover, we show that our approach is computationally inexpensive, as it can be run in less than 1 hour on a commercial laptop, while yielding accurate results using less than $10^4$ training seismograms. We additionally demonstrate how the trained emulators can be used to identify the source mechanism through the estimation of the Bayesian evidence. Finally, we demonstrate that our approach is robust to real noise as measured in field data. This work lays the foundations for efficient, accurate future joint determinations of event location and moment tensor, and associated uncertainties, which are ultimately key for accurately characterising human-induced and natural earthquakes, and for enhanced quantitative seismic hazard assessments.