GAOct 24, 2022
$\texttt{Mangrove}$: Learning Galaxy Properties from Merger TreesChristian Kragh Jespersen, Miles Cranmer, Peter Melchior et al. · cambridge, princeton
Efficiently mapping baryonic properties onto dark matter is a major challenge in astrophysics. Although semi-analytic models (SAMs) and hydrodynamical simulations have made impressive advances in reproducing galaxy observables across cosmologically significant volumes, these methods still require significant computation times, representing a barrier to many applications. Graph Neural Networks (GNNs) have recently proven to be the natural choice for learning physical relations. Among the most inherently graph-like structures found in astrophysics are the dark matter merger trees that encode the evolution of dark matter halos. In this paper we introduce a new, graph-based emulator framework, $\texttt{Mangrove}$, and show that it emulates the galactic stellar mass, cold gas mass and metallicity, instantaneous and time-averaged star formation rate, and black hole mass -- as predicted by a SAM -- with root mean squared error up to two times lower than other methods across a $(75 Mpc/h)^3$ simulation box in 40 seconds, 4 orders of magnitude faster than the SAM. We show that $\texttt{Mangrove}$ allows for quantification of the dependence of galaxy properties on merger history. We compare our results to the current state of the art in the field and show significant improvements for all target properties. $\texttt{Mangrove}$ is publicly available.
92.1MLMay 25Code
DiscoverPhysics: Benchmarking LLMs for Out-of-the-Box Scientific ThinkingMatt L. Wiemann, Lindsay M. Smith, Peter Melchior et al.
Frontier LLMs now perform strongly across a wide range of physics evaluations, but it is hard to disentangle genuine reasoning from recall of established science. We introduce DiscoverPhysics, an interactive benchmark that asks a LLM agent to discover the laws of motion of a simulated world whose physics deliberately deviates from our own. We construct 22 worlds governed by, among others, screened and fractional-power gravity, multi-species couplings, hidden dark-matter-like particles, non-coordinate-free physics, and time-varying interactions. Each world is generated on demand by an N-body simulator, for which the agent proposes several rounds of experiments, observes raw trajectory data, and ultimately submits both a natural-language explanation of the world's physics and a Python implementation of the inferred law. Because solving a world requires the agent to design informative experiments and revise its hypotheses, the benchmark probes long-horizon reasoning over an experimental history. We evaluate submissions along two complementary axes: trajectory MSE on held-out particles and an LLM-judged explanation score following an expert-written rubric assessing conceptual understanding of each world. Across eleven frontier models, we find that the strongest agents pass only half of the worlds and consistently fail on those where latent structure must be uncovered. Open-source models lag substantially behind commercial models, both in their ability to design informative experiments and in extracting conclusions from the data. We further find that good predictive accuracy does not guarantee high explanation quality and that conceptual understanding depends on hypothesis refinement through well-chosen experiments.
IMJan 20
Opportunities in AI/ML for the Rubin LSST Dark Energy Science CollaborationLSST Dark Energy Science Collaboration, Eric Aubourg, Camille Avestruz et al.
The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will produce unprecedented volumes of heterogeneous astronomical data (images, catalogs, and alerts) that challenge traditional analysis pipelines. The LSST Dark Energy Science Collaboration (DESC) aims to derive robust constraints on dark energy and dark matter from these data, requiring methods that are statistically powerful, scalable, and operationally reliable. Artificial intelligence and machine learning (AI/ML) are already embedded across DESC science workflows, from photometric redshifts and transient classification to weak lensing inference and cosmological simulations. Yet their utility for precision cosmology hinges on trustworthy uncertainty quantification, robustness to covariate shift and model misspecification, and reproducible integration within scientific pipelines. This white paper surveys the current landscape of AI/ML across DESC's primary cosmological probes and cross-cutting analyses, revealing that the same core methodologies and fundamental challenges recur across disparate science cases. Since progress on these cross-cutting challenges would benefit multiple probes simultaneously, we identify key methodological research priorities, including Bayesian inference at scale, physics-informed methods, validation frameworks, and active learning for discovery. With an eye on emerging techniques, we also explore the potential of the latest foundation model methodologies and LLM-driven agentic AI systems to reshape DESC workflows, provided their deployment is coupled with rigorous evaluation and governance. Finally, we discuss critical software, computing, data infrastructure, and human capital requirements for the successful deployment of these new methodologies, and consider associated risks and opportunities for broader coordination with external actors.
COOct 29, 2020Code
deep21: a Deep Learning Method for 21cm Foreground RemovalT. Lucas Makinen, Lachlan Lancaster, Francisco Villaescusa-Navarro et al.
We seek to remove foreground contaminants from 21cm intensity mapping observations. We demonstrate that a deep convolutional neural network (CNN) with a UNet architecture and three-dimensional convolutions, trained on simulated observations, can effectively separate frequency and spatial patterns of the cosmic neutral hydrogen (HI) signal from foregrounds in the presence of noise. Cleaned maps recover cosmological clustering statistics within 10% at all relevant angular scales and frequencies. This amounts to a reduction in prediction variance of over an order of magnitude on small angular scales ($\ell > 300$), and improved accuracy for small radial scales ($k_{\parallel} > 0.17\ \rm h\ Mpc^{-1})$ compared to standard Principal Component Analysis (PCA) methods. We estimate posterior confidence intervals for the network's prediction by training an ensemble of UNets. Our approach demonstrates the feasibility of analyzing 21cm intensity maps, as opposed to derived summary statistics, for upcoming radio experiments, as long as the simulated foreground model is sufficiently realistic. We provide the code used for this analysis on Github https://github.com/tlmakinen/deep21 as well as a browser-based tutorial for the experiment and UNet model via the accompanying http://bit.ly/deep21-colab Colab notebook.
OCAug 30, 2017Code
Block-Simultaneous Direction Method of Multipliers: A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraintsFred Moolekamp, Peter Melchior
We introduce a generalization of the linearized Alternating Direction Method of Multipliers to optimize a real-valued function $f$ of multiple arguments with potentially multiple constraints $g_\circ$ on each of them. The function $f$ may be nonconvex as long as it is convex in every argument, while the constraints $g_\circ$ need to be convex but not smooth. If $f$ is smooth, the proposed Block-Simultaneous Direction Method of Multipliers (bSDMM) can be interpreted as a proximal analog to inexact coordinate descent methods under constraints. Unlike alternative approaches for joint solvers of multiple-constraint problems, we do not require linear operators $L$ of a constraint function $g(L\ \cdot)$ to be invertible or linked between each other. bSDMM is well-suited for a range of optimization problems, in particular for data analysis, where $f$ is the likelihood function of a model and $L$ could be a transformation matrix describing e.g. finite differences or basis transforms. We apply bSDMM to the Non-negative Matrix Factorization task of a hyperspectral unmixing problem and demonstrate convergence and effectiveness of multiple constraints on both matrix factors. The algorithms are implemented in python and released as an open-source package.
LGOct 30, 2023
Multiscale Feature Attribution for OutliersJeff Shen, Peter Melchior
Machine learning techniques can automatically identify outliers in massive datasets, much faster and more reproducible than human inspection ever could. But finding such outliers immediately leads to the question: which features render this input anomalous? We propose a new feature attribution method, Inverse Multiscale Occlusion, that is specifically designed for outliers, for which we have little knowledge of the type of features we want to identify and expect that the model performance is questionable because anomalous test data likely exceed the limits of the training data. We demonstrate our method on outliers detected in galaxy spectra from the Dark Energy Survey Instrument and find its results to be much more interpretable than alternative attribution approaches.
LGOct 11, 2024
Path-minimizing Latent ODEs for improved extrapolation and inferenceMatt L. Sampson, Peter Melchior
Latent ODE models provide flexible descriptions of dynamic systems, but they can struggle with extrapolation and predicting complicated non-linear dynamics. The latent ODE approach implicitly relies on encoders to identify unknown system parameters and initial conditions, whereas the evaluation times are known and directly provided to the ODE solver. This dichotomy can be exploited by encouraging time-independent latent representations. By replacing the common variational penalty in latent space with an $\ell_2$ penalty on the path length of each system, the models learn data representations that can easily be distinguished from those of systems with different configurations. This results in faster training, smaller models, more accurate interpolation and long-time extrapolation compared to the baseline ODE models with GRU, RNN, and LSTM encoder/decoders on tests with damped harmonic oscillator, self-gravitating fluid, and predator-prey systems. We also demonstrate superior results for simulation-based inference of the Lotka-Volterra parameters and initial conditions by using the latents as data summaries for a conditional normalizing flow. Our change to the training loss is agnostic to the specific recognition network used by the decoder and can therefore easily be adopted by other latent ODE models.
LGNov 21, 2025
A novel approach to classification of ECG arrhythmia types with latent ODEsAngelina Yan, Matt L. Sampson, Peter Melchior
12-lead ECGs with high sampling frequency are the clinical gold standard for arrhythmia detection, but their short-term, spot-check nature often misses intermittent events. Wearable ECGs enable long-term monitoring but suffer from irregular, lower sampling frequencies due to battery constraints, making morphology analysis challenging. We present an end-to-end classification pipeline to address these issues. We train a latent ODE to model continuous ECG waveforms and create robust feature vectors from high-frequency single-channel signals. We construct three latent vectors per waveform via downsampling the initial 360 Hz ECG to 90 Hz and 45 Hz. We then use a gradient boosted tree to classify these vectors and test robustness across frequencies. Performance shows minimal degradation, with macro-averaged AUC-ROC values of 0.984, 0.978, and 0.976 at 360 Hz, 90 Hz, and 45 Hz, respectively, suggesting a way to sidestep the trade-off between signal fidelity and battery life. This enables smaller wearables, promoting long-term monitoring of cardiac health.
COOct 22, 2025
Transfer Learning Beyond the Standard ModelVeena Krishnaraj, Adrian E. Bayer, Christian Kragh Jespersen et al.
Machine learning enables powerful cosmological inference but typically requires many high-fidelity simulations covering many cosmological models. Transfer learning offers a way to reduce the simulation cost by reusing knowledge across models. We show that pre-training on the standard model of cosmology, $Λ$CDM, and fine-tuning on various beyond-$Λ$CDM scenarios -- including massive neutrinos, modified gravity, and primordial non-Gaussianities -- can enable inference with significantly fewer beyond-$Λ$CDM simulations. However, we also show that negative transfer can occur when strong physical degeneracies exist between $Λ$CDM and beyond-$Λ$CDM parameters. We consider various transfer architectures, finding that including bottleneck structures provides the best performance. Our findings illustrate the opportunities and pitfalls of foundation-model approaches in physics: pre-training can accelerate inference, but may also hinder learning new physics.
LGSep 27, 2025
Dynamics of Learning: Generative Schedules from Latent ODEsMatt L. Sampson, Peter Melchior
The learning rate schedule is one of the most impactful aspects of neural network optimization, yet most schedules either follow simple parametric functions or react only to short-term training signals. None of them are supported by a comprehensive temporal view of how well neural networks actually train. We present a new learning rate scheduler that models the training performance of neural networks as a dynamical system. It leverages training runs from a hyperparameter search to learn a latent representation of the training process. Given current training metrics, it predicts the future learning rate schedule with the best long-term validation performance. Our scheduler generalizes beyond previously observed training dynamics and creates specialized schedules that deviate noticeably from common parametric functions. It achieves SOTA results for image classification with CNN and ResNet models as well as for next-token prediction with a transformer model. The trained models are located in flatter regions of the loss landscape and thus provide better generalization than those trained with other schedules. Our method is computationally efficient, optimizer-agnostic, and can easily be layered on top of ML experiment-tracking platforms. An implementation of our scheduler will be made available after acceptance.
AISep 2, 2025
The Future of Artificial Intelligence and the Mathematical and Physical Sciences (AI+MPS)Andrew Ferguson, Marisa LaFleur, Lars Ruthotto et al. · stanford
This community paper developed out of the NSF Workshop on the Future of Artificial Intelligence (AI) and the Mathematical and Physics Sciences (MPS), which was held in March 2025 with the goal of understanding how the MPS domains (Astronomy, Chemistry, Materials Research, Mathematical Sciences, and Physics) can best capitalize on, and contribute to, the future of AI. We present here a summary and snapshot of the MPS community's perspective, as of Spring/Summer 2025, in a rapidly developing field. The link between AI and MPS is becoming increasingly inextricable; now is a crucial moment to strengthen the link between AI and Science by pursuing a strategy that proactively and thoughtfully leverages the potential of AI for scientific discovery and optimizes opportunities to impact the development of AI by applying concepts from fundamental science. To achieve this, we propose activities and strategic priorities that: (1) enable AI+MPS research in both directions; (2) build up an interdisciplinary community of AI+MPS researchers; and (3) foster education and workforce development in AI for MPS researchers and students. We conclude with a summary of suggested priorities for funding agencies, educational institutions, and individual researchers to help position the MPS community to be a leader in, and take full advantage of, the transformative potential of AI+MPS.
GAMar 5, 2025
The optical and infrared are connectedChristian K. Jespersen, Peter Melchior, David N. Spergel et al.
Galaxies are often modelled as composites of separable components with distinct spectral signatures, implying that different wavelength ranges are only weakly correlated. They are not. We present a data-driven model which exploits subtle correlations between physical processes to accurately predict infrared (IR) WISE photometry from a neural summary of optical SDSS spectra. The model achieves accuracies of $χ^2_N \approx 1$ for all photometric bands in WISE, as well as good colors. We are also able to tightly constrain typically IR-derived properties, e.g. the bolometric luminosities of AGN and dust parameters such as $\mathrm{q_{PAH}}$. We find that current SED-fitting methods are incapable of making comparable predictions, and that model misspecification often leads to correlated biases in star-formation rates and AGN luminosities. To help improve SED models, we determine what features of the optical spectrum are responsible for our improved predictions, and identify several lines (CaII, SrII, FeI, [OII] and H$α$), which point to the complex chronology of star formation and chemical enrichment being incorrectly modelled.
IMSep 27, 2021
Graph Neural Network-based Resource Allocation Strategies for Multi-Object SpectroscopyTianshu Wang, Peter Melchior
Resource allocation problems are often approached with linear programming techniques. But many concrete allocation problems in the experimental and observational sciences cannot or should not be expressed in the form of linear objective functions. Even if the objective is linear, its parameters may not be known beforehand because they depend on the results of the experiment for which the allocation is to be determined. To address these challenges, we present a bipartite Graph Neural Network architecture for trainable resource allocation strategies. Items of value and constraints form the two sets of graph nodes, which are connected by edges corresponding to possible allocations. The GNN is trained on simulations or past problem occurrences to maximize any user-supplied, scientifically motivated objective function, augmented by an infeasibility penalty. The amount of feasibility violation can be tuned in relation to any available slack in the system. We apply this method to optimize the astronomical target selection strategy for the highly multiplexed Subaru Prime Focus Spectrograph instrument, where it shows superior results to direct gradient descent optimization and extends the capabilities of the currently employed solver which uses linear objective functions. The development of this method enables fast adjustment and deployment of allocation strategies, statistical analyses of allocation patterns, and fully differentiable, science-driven solutions for resource allocation problems.
LGJun 17, 2021
Unsupervised Resource Allocation with Graph Neural NetworksMiles Cranmer, Peter Melchior, Brian Nord
We present an approach for maximizing a global utility function by learning how to allocate resources in an unsupervised way. We expect interactions between allocation targets to be important and therefore propose to learn the reward structure for near-optimal allocation policies with a GNN. By relaxing the resource constraint, we can employ gradient-based optimization in contrast to more standard evolutionary algorithms. Our algorithm is motivated by a problem in modern astronomy, where one needs to select-based on limited initial information-among $10^9$ galaxies those whose detailed measurement will lead to optimal inference of the composition of the universe. Our technique presents a way of flexibly learning an allocation strategy by only requiring forward simulators for the physics of interest and the measurement process. We anticipate that our technique will also find applications in a range of resource allocation problems.
IMDec 9, 2019
Hybrid Physical-Deep Learning Model for Astronomical Inverse ProblemsFrancois Lanusse, Peter Melchior, Fred Moolekamp
We present a Bayesian machine learning architecture that combines a physically motivated parametrization and an analytic error model for the likelihood with a deep generative model providing a powerful data-driven prior for complex signals. This combination yields an interpretable and differentiable generative model, allows the incorporation of prior knowledge, and can be utilized for observations with different data quality without having to retrain the deep network. We demonstrate our approach with an example of astronomical source separation in current imaging data, yielding a physical and interpretable model of astronomical scenes.