Gert Aarts

HEP-LAT

h-index119

20 papers

248 citations

Novelty 40%

AI Score 42

Ranked #85,374 of 205,806 authors (top 41%)

#8 in HEP-LAT (top 21%)

20 Papers

HEP-LAT Sep 29, 2023

Diffusion Models as Stochastic Quantization in Lattice Field Theory

Lingxiao Wang, Gert Aarts, Kai Zhou

In this work, we establish a direct connection between generative diffusion models (DMs) and stochastic quantization (SQ). The DM is realized by approximating the reversal of a stochastic process dictated by the Langevin equation, generating samples from a prior distribution to effectively mimic the target distribution. Using numerical simulations, we demonstrate that the DM can serve as a global sampler for generating quantum lattice field configurations in two-dimensional $φ^4$ theory. We demonstrate that DMs can notably reduce autocorrelation times in the Markov chain, especially in the critical region where standard Markov Chain Monte-Carlo (MCMC) algorithms experience critical slowing down. The findings can potentially inspire further advancements in lattice field theory simulations, in particular in cases where it is expensive to generate large ensembles.
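As a rough illustration of the stochastic quantization mentioned in this abstract, the sketch below runs a discretized Langevin process whose stationary distribution approximates $e^{-S}$. It is not the authors' code: it assumes a zero-dimensional toy action $S(φ) = \frac{1}{2} m^2 φ^2 + \frac{λ}{4} φ^4$ (a single site, no lattice), and the function names, parameter values, and step size are all hypothetical. It shows only the Langevin sampling step, not the training of a diffusion model.

```python
import numpy as np


def drift(phi, m2=1.0, lam=0.5):
    """Langevin drift -dS/dphi for the assumed toy action
    S = m2/2 * phi**2 + lam/4 * phi**4."""
    return -(m2 * phi + lam * phi**3)


def langevin_samples(n_chains=4000, n_steps=2000, eps=0.01, seed=0):
    """Evolve independent chains with the Euler-Maruyama update
    phi <- phi + eps * drift(phi) + sqrt(2 * eps) * noise."""
    rng = np.random.default_rng(seed)
    phi = rng.normal(size=n_chains)  # arbitrary starting configuration
    for _ in range(n_steps):
        noise = rng.normal(size=n_chains)
        phi = phi + eps * drift(phi) + np.sqrt(2.0 * eps) * noise
    return phi


def exact_second_moment(m2=1.0, lam=0.5):
    """Reference value of <phi^2> under exp(-S), from a simple grid sum."""
    x = np.linspace(-10.0, 10.0, 200001)
    weight = np.exp(-(0.5 * m2 * x**2 + 0.25 * lam * x**4))
    return np.sum(x**2 * weight) / np.sum(weight)


if __name__ == "__main__":
    samples = langevin_samples()
    print("Langevin <phi^2>:", np.mean(samples**2))
    print("Grid-sum <phi^2>:", exact_second_moment())
```

The two printed numbers should agree up to statistical noise and a small bias of order the step size `eps`, which is a property of the simple Euler-Maruyama discretization used here.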

HEP-LAT Nov 6, 2023

Generative Diffusion Models for Lattice Field Theory

Lingxiao Wang, Gert Aarts, Kai Zhou

This study delves into the connection between machine learning and lattice field theory by linking generative diffusion models (DMs) with stochastic quantization, from a stochastic differential equation perspective. We show that DMs can be conceptualized by reversing a stochastic process driven by the Langevin equation, which then produces samples from an initial distribution to approximate the target distribution. In a toy model, we highlight the capability of DMs to learn effective actions. Furthermore, we demonstrate their feasibility to act as a global sampler for generating configurations in the two-dimensional $φ^4$ quantum lattice field theory.

DIS-NN Jul 23, 2024

Stochastic weight matrix dynamics during learning and Dyson Brownian motion

Gert Aarts, Biagio Lucini, Chanju Park

We demonstrate that the update of weight matrices in learning algorithms can be described in the framework of Dyson Brownian motion, thereby inheriting many features of random matrix theory. We relate the level of stochasticity to the ratio of the learning rate and the mini-batch size, providing more robust evidence to a previously conjectured scaling relationship. We discuss universal and non-universal features in the resulting Coulomb gas distribution and identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian restricted Boltzmann machine.

HEP-LAT Jan 27

Generalizable Equivariant Diffusion Models for Non-Abelian Lattice Gauge Theory

Gert Aarts, Diaa E. Habibi, Andreas Ipp et al.

We demonstrate that gauge equivariant diffusion models can accurately model the physics of non-Abelian lattice gauge theory using the Metropolis-adjusted annealed Langevin algorithm (MAALA), as exemplified by computations in two-dimensional U(2) and SU(2) gauge theories. Our network architecture is based on lattice gauge equivariant convolutional neural networks (L-CNNs), which respect local and global symmetries on the lattice. Models are trained on a single ensemble generated using a traditional Monte Carlo method. By studying Wilson loops of various sizes as well as the topological susceptibility, we find that the diffusion approach generalizes remarkably well to larger inverse couplings and lattice sizes with negligible loss of accuracy while retaining moderately high acceptance rates.

HEP-LAT Jan 9, 2025

Physics-Driven Learning for Inverse Problems in Quantum Chromodynamics

Gert Aarts, Kenji Fukushima, Tetsuo Hatsuda et al.

The integration of deep learning techniques and physics-driven designs is reforming the way we address inverse problems, in which accurate physical properties are extracted from complex data sets. This is particularly relevant for quantum chromodynamics (QCD), the theory of strong interactions, with its inherent limitations in observational data and demanding computational approaches. This perspective highlights advances and potential of physics-driven learning methods, focusing on predictions of physical quantities towards QCD physics, and drawing connections to machine learning (ML). It is shown that the fusion of ML and physics can lead to more efficient and reliable problem-solving strategies. Key ideas of ML, methodology of embedding physics priors, and generative models as inverse modelling of physical probability distributions are introduced. Specific applications cover first-principle lattice calculations, and QCD physics of hadrons, neutron stars, and heavy-ion collisions. These examples provide a structured and concise overview of how incorporating prior knowledge such as symmetry, continuity and equations into deep learning designs can address diverse inverse problems across different physical sciences.

HEP-LAT Oct 28, 2024

On learning higher-order cumulants in diffusion models

Gert Aarts, Diaa E. Habibi, Lingxiao Wang et al.

To analyse how diffusion models learn correlations beyond Gaussian ones, we study the behaviour of higher-order cumulants, or connected n-point functions, under both the forward and backward process. We derive explicit expressions for the moment- and cumulant-generating functionals, in terms of the distribution of the initial data and properties of the forward process. It is shown analytically that during the forward process higher-order cumulants are conserved in models without a drift, such as the variance-expanding scheme, and that therefore the endpoint of the forward process maintains nontrivial correlations. We demonstrate that since these correlations are encoded in the score function, higher-order cumulants are learnt in the backward process, also when starting from a normal prior. We confirm our analytical results in an exactly solvable toy model with nonzero cumulants and in scalar lattice field theory.

HEP-LAT Feb 8, 2025

Physics-Conditioned Diffusion Models for Lattice Gauge Theory

Qianteng Zhu, Gert Aarts, Wei Wang et al.

We develop diffusion models for simulating lattice gauge theories, where stochastic quantization is explicitly incorporated as a physical condition for sampling. We demonstrate the applicability of this novel sampler to U(1) gauge theory in two spacetime dimensions and find that a model trained at a small inverse coupling constant can be extrapolated to larger inverse coupling regions without encountering the topological freezing problem. Additionally, the trained model can be employed to sample configurations on different lattice sizes without requiring further training. The exactness of the generated samples is ensured by incorporating Metropolis-adjusted Langevin dynamics into the generation process. Furthermore, we demonstrate that this approach enables more efficient sampling of topological quantities compared to traditional algorithms such as Hybrid Monte Carlo and Langevin simulations.

HEP-LAT Dec 2, 2024

Diffusion models learn distributions generated by complex Langevin dynamics

Diaa E. Habibi, Gert Aarts, Lingxiao Wang et al.

The probability distribution effectively sampled by a complex Langevin process for theories with a sign problem is not known a priori and notoriously hard to understand. Diffusion models, a class of generative AI, can learn distributions from data. In this contribution, we explore the ability of diffusion models to learn the distributions created by a complex Langevin process.

DIS-NN Nov 20, 2024

Dyson Brownian motion and random matrix dynamics of weight matrices during learning

Gert Aarts, Ouraman Hajizadeh, Biagio Lucini et al.

During training, weight matrices in machine learning architectures are updated using stochastic gradient descent or variations thereof. In this contribution we employ concepts of random matrix theory to analyse the resulting stochastic matrix dynamics. We first demonstrate that the dynamics can generically be described using Dyson Brownian motion, leading to e.g. eigenvalue repulsion. The level of stochasticity is shown to depend on the ratio of the learning rate and the mini-batch size, explaining the empirically observed linear scaling rule. We verify this linear scaling in the restricted Boltzmann machine. Subsequently we study weight matrix dynamics in transformers (a nano-GPT), following the evolution from a Marchenko-Pastur distribution for eigenvalues at initialisation to a combination with additional structure at the end of learning.
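The dependence on the ratio of learning rate and mini-batch size described in this abstract can be checked in a much simpler setting than the one the authors study. The sketch below is an assumption-laden toy, not their method: it runs mini-batch SGD on a one-parameter quadratic loss $L(w) = \frac{1}{2}\,\mathbb{E}[(w - x)^2]$ with fresh Gaussian data drawn at every step, and measures the stationary spread of $w$ across independent runs. The function name and all parameter values are hypothetical.

```python
import numpy as np


def sgd_stationary_variance(lr, batch_size, n_replicas=2000, n_steps=1000,
                            mu=0.0, sigma=1.0, seed=0):
    """Variance of w across independent SGD runs after n_steps updates.

    Assumed toy setup: loss 0.5 * E[(w - x)^2] with x ~ N(mu, sigma^2),
    so the mini-batch gradient is w - mean(batch).
    """
    rng = np.random.default_rng(seed)
    w = np.full(n_replicas, mu)  # start every replica at the minimum
    for _ in range(n_steps):
        batch = rng.normal(mu, sigma, size=(n_replicas, batch_size))
        grad = w - batch.mean(axis=1)  # noisy gradient estimate
        w = w - lr * grad              # plain SGD update
    return w.var()


if __name__ == "__main__":
    for lr, batch_size in [(0.01, 8), (0.02, 16), (0.02, 8)]:
        var = sgd_stationary_variance(lr, batch_size)
        print(f"lr={lr}, batch={batch_size}, lr/batch={lr / batch_size:.5f}, "
              f"var(w)={var:.2e}")
```

For this linear toy the stationary variance is $\mathrm{lr}\,\sigma^2 / (B\,(2 - \mathrm{lr}))$, so for small learning rates it depends on the learning rate and batch size $B$ only through their ratio: doubling both leaves the spread nearly unchanged, while doubling the learning rate alone roughly doubles it.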

HEP-LAT Oct 1, 2025

Combining complex Langevin dynamics with score-based and energy-based diffusion models

Gert Aarts, Diaa E. Habibi, Lingxiao Wang et al.

Theories with a sign problem due to a complex action or Boltzmann weight can sometimes be numerically solved using a stochastic process in the complexified configuration space. However, the probability distribution effectively sampled by this complex Langevin process is not known a priori and notoriously hard to understand. In generative AI, diffusion models can learn distributions, or their log derivatives, from data. We explore the ability of diffusion models to learn the distributions sampled by a complex Langevin process, comparing score-based and energy-based diffusion models, and speculate about possible applications.

DIS-NN Sep 1, 2025

Phase diagram and eigenvalue dynamics of stochastic gradient descent in multilayer neural networks

Chanju Park, Biagio Lucini, Gert Aarts

Hyperparameter tuning is one of the essential steps to guarantee the convergence of machine learning models. We argue that intuition about the optimal choice of hyperparameters for stochastic gradient descent can be obtained by studying a neural network's phase diagram, in which each phase is characterised by distinctive dynamics of the singular values of weight matrices. Taking inspiration from disordered systems, we start from the observation that the loss landscape of a multilayer neural network with mean squared error can be interpreted as a disordered system in feature space, where the learnt features are mapped to soft spin degrees of freedom, the initial variance of the weight matrices is interpreted as the strength of the disorder, and temperature is given by the ratio of the learning rate and the batch size. As the model is trained, three phases can be identified, in which the dynamics of weight matrices is qualitatively different. Employing a Langevin equation for stochastic gradient descent, previously derived using Dyson Brownian motion, we demonstrate that the three dynamical regimes can be classified effectively, providing practical guidance for the choice of hyperparameters of the optimiser.

IM Mar 18, 2025

Strategic White Paper on AI Infrastructure for Particle, Nuclear, and Astroparticle Physics: Insights from JENA and EuCAIF

Sascha Caron, Andreas Ipp, Gert Aarts et al.

Artificial intelligence (AI) is transforming scientific research, with deep learning methods playing a central role in data analysis, simulations, and signal detection across particle, nuclear, and astroparticle physics. Within the JENA communities (ECFA, NuPECC, and APPEC) and as part of the EuCAIF initiative, AI integration is advancing steadily. However, broader adoption remains constrained by challenges such as limited computational resources, a lack of expertise, and difficulties in transitioning from research and development (R&D) to production. This white paper provides a strategic roadmap, informed by a community survey, to address these barriers. It outlines critical infrastructure requirements, prioritizes training initiatives, and proposes funding strategies to scale AI capabilities across fundamental physics over the next five years.

HEP-LAT Dec 29, 2024

Random Matrix Theory for Stochastic Gradient Descent

Chanju Park, Matteo Favoni, Biagio Lucini et al.

Investigating the dynamics of learning in machine learning algorithms is of paramount importance for understanding how and why an approach may be successful. The tools of physics and statistics provide a robust setting for such investigations. Here we apply concepts from random matrix theory to describe stochastic weight matrix dynamics, using the framework of Dyson Brownian motion. We derive the linear scaling rule between the learning rate (step size) and the batch size, and identify universal and non-universal aspects of weight matrix dynamics. We test our findings in the (near-)solvable case of the Gaussian Restricted Boltzmann Machine and in a linear one-hidden-layer neural network.

HEP-LAT Feb 10, 2022

Applications of Machine Learning to Lattice Quantum Field Theory

Denis Boyda, Salvatore Calì, Sam Foreman et al.

There is great potential to apply machine learning in the area of numerical lattice quantum field theory, but full exploitation of that potential will require new strategies. In this white paper for the Snowmass community planning process, we discuss the unique requirements of machine learning for lattice quantum field theory research and outline what is needed to enable exploration and deployment of this approach in the future.

AI Dec 29, 2021

Towards a Shapley Value Graph Framework for Medical peer-influence

Jamie Duell, Monika Seisenberger, Gert Aarts et al.

eXplainable Artificial Intelligence (XAI) is a sub-field of Artificial Intelligence (AI) that is at the forefront of AI research. In XAI, feature attribution methods produce explanations in the form of feature importance. People often use feature importance as guidance for intervention. However, a limitation of existing feature attribution methods is that there is a lack of explanation towards the consequence of intervention. In other words, although contribution towards a certain prediction is highlighted by feature attribution methods, the relation between features and the consequence of intervention is not studied. The aim of this paper is to introduce a new framework, called a peer influence framework, to look deeper into explanations using graph representation for feature-to-feature interactions to improve the interpretability of black-box Machine Learning models and inform intervention.

LG Oct 21, 2021

Quantum field theories, Markov random fields and machine learning