Piotr Białas

STAT-MECH
h-index: 2
6 papers
36 citations
Novelty: 36%
AI Score: 41

6 Papers

STAT-MECH · Mar 21, 2022
Hierarchical autoregressive neural networks for statistical systems

Piotr Białas, Piotr Korcyl, Tomasz Stebel

It was recently proposed that neural networks could be used to approximate many-dimensional probability distributions that appear e.g. in lattice field theories or statistical mechanics. Subsequently they can be used as variational approximators to assess extensive properties of statistical systems, like free energy, and also as neural samplers used in Monte Carlo simulations. The practical application of this approach is unfortunately limited by its unfavorable scaling with the system size, both of the numerical cost required for training and of the memory requirements. This is due to the fact that the original proposition involved a neural network whose width scaled with the total number of degrees of freedom, e.g. $L^2$ in the case of a two-dimensional $L\times L$ lattice. In this work we propose a hierarchical association of physical degrees of freedom, for instance spins, to neurons, which replaces it with a scaling with the linear extent $L$ of the system. We demonstrate our approach on the two-dimensional Ising model by simulating lattices of various sizes up to $128 \times 128$ spins, with time benchmarks reaching lattices of size $512 \times 512$. We observe that our proposal improves the quality of neural network training, i.e. the approximated probability distribution is closer to the target than could previously be achieved. As a consequence, the variational free energy reaches a value closer to its theoretical expectation and, if applied in a Markov Chain Monte Carlo algorithm, the resulting autocorrelation time is smaller. Finally, the replacement of a single neural network by a hierarchy of smaller networks considerably reduces the memory requirements.
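The variational use mentioned in this abstract rests on the bound $F_q \ge F$, where $F_q = \beta^{-1}\sum_s q(s)\,[\ln q(s) + \beta E(s)]$ for any trial distribution $q$. A minimal sketch of that bound, assuming brute-force enumeration on a tiny periodic lattice and a uniform trial distribution standing in for a trained network; all function names are illustrative and not taken from the paper:

```python
import itertools
import math

def ising_energy(spins, L):
    """Nearest-neighbour Ising energy on an L x L periodic lattice (J = 1)."""
    e = 0
    for x in range(L):
        for y in range(L):
            s = spins[x * L + y]
            e -= s * spins[((x + 1) % L) * L + y]   # neighbour in x
            e -= s * spins[x * L + (y + 1) % L]     # neighbour in y
    return e

def exact_free_energy(L, beta):
    """F = -(1/beta) ln Z by summing over all 2^(L*L) configurations."""
    z = sum(math.exp(-beta * ising_energy(c, L))
            for c in itertools.product((1, -1), repeat=L * L))
    return -math.log(z) / beta

def variational_free_energy(q, L, beta):
    """F_q = (1/beta) * sum_s q(s) [ln q(s) + beta E(s)]."""
    total = 0.0
    for c in itertools.product((1, -1), repeat=L * L):
        p = q(c)
        if p > 0.0:
            total += p * (math.log(p) + beta * ising_energy(c, L))
    return total / beta

def uniform_q(config):
    """Assumed trial distribution: every configuration equally likely."""
    return 0.5 ** len(config)

f_exact = exact_free_energy(2, 0.4)
f_var = variational_free_energy(uniform_q, 2, 0.4)
```

Here `f_var` can never fall below `f_exact`; a trained network would aim to shrink the gap between the two.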

32.4 · LG · May 15
Variational Autoregressive Networks with probability priors

Piotr Białas, Piotr Korcyl, Tomasz Stebel et al.

Monte Carlo methods are essential across diverse scientific fields, yet their efficiency is frequently hampered by critical slowing down, a sharp increase in autocorrelation times near phase transitions. Although deep learning approaches, such as neural-network-based samplers, have been proposed to alleviate this issue, they face another serious problem: the difficulty of training the models. This difficulty partially stems from the overly general nature of original machine-learning architectures, which often ignore underlying physical symmetries and force networks to relearn them from scratch. In this paper, we demonstrate that incorporating physical priors into the model significantly enhances performance. Building upon existing strategies that integrate spin-spin interactions, we propose a framework that utilizes a prior probability distribution as a starting point for training. Our results for the Ising model, as well as for the Edwards-Anderson spin glass model, suggest that moving away from 'blank slate' models in favor of physics-informed priors reduces the training burden and facilitates the simulation of larger system sizes in discrete spin models.

67.2 · DIS-NN · Apr 30
Sampling two-dimensional spin systems with transformers

Piotr Białas, Piotr Korcyl, Tomasz Stebel et al.

Autoregressive Neural Networks based on dense or convolutional layers have recently been shown to be a viable strategy for generating classical spin systems. Unlike these methods, sampling with transformers is commonly considered to be computationally inefficient. In this work, we propose a novel approach to transformer-based neural samplers in which we generate not a single spin per step but groups of spins. As an additional improvement, we construct a model of approximated probabilities, further improving the efficiency of the algorithm. Despite our approach being computationally heavier than dense networks or CNN-based approaches, we were able to sample larger systems of up to $180 \times 180$ spins in the case of the Ising model. The Effective Sample Size of our sampler is $\sim 20$ times larger than that of the previous state-of-the-art neural sampler when trained for the $128 \times 128$ Ising model at critical temperature. Finally, we also test our algorithm on the 2D Edwards-Anderson model, where we train $64\times 64$ spin systems.
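The Effective Sample Size quoted in this abstract is not defined in the listing. A minimal sketch, assuming the commonly used normalized form $\mathrm{ESS} = (\sum_i w_i)^2 / (N \sum_i w_i^2)$ with importance weights $w_i = p(s_i)/q(s_i)$; the function name and the log-weight input are choices made for this illustration:

```python
import math

def effective_sample_size(log_weights):
    """Normalized ESS in (0, 1] from log importance weights ln(p/q).

    Weights may be unnormalized; subtracting the maximum log-weight
    before exponentiating avoids overflow and does not change the ratio.
    """
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    s1 = sum(w)
    s2 = sum(x * x for x in w)
    return s1 * s1 / (len(w) * s2)

# A perfect sampler (all weights equal) gives ESS = 1.
ess_perfect = effective_sample_size([0.0, 0.0, 0.0, 0.0])
# One dominant weight among four samples gives ESS = 1/4.
ess_poor = effective_sample_size([0.0, -1000.0, -1000.0, -1000.0])
```

Under this definition a value near 1 means the network distribution closely matches the target, while a value near $1/N$ means a single sample dominates.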

STAT-MECH · Mar 11, 2025
Hierarchical autoregressive neural networks in three-dimensional statistical system

Piotr Białas, Vaibhav Chahar, Piotr Korcyl et al.

Autoregressive Neural Networks (ANN) have been recently proposed as a mechanism to improve the efficiency of Monte Carlo algorithms for several spin systems. The idea relies on the fact that the total probability of a configuration can be factorized into conditional probabilities of each spin, which in turn can be approximated by a neural network. Once trained, the ANNs can be used to sample configurations from the approximated probability distribution and to explicitly evaluate this probability for a given configuration. It has also been observed that such conditional probabilities give access to information-theoretic observables such as mutual information or entanglement entropy. In this paper, we describe the hierarchical autoregressive network (HAN) algorithm in three spatial dimensions and study its performance using the example of the Ising model. We compare HAN with three other autoregressive architectures and the classical Wolff cluster algorithm. Finally, we provide estimates of thermodynamic observables for the three-dimensional Ising model, such as entropy and free energy, in a range of temperatures across the phase transition.
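The factorization described in this abstract, $q(s) = \prod_i q(s_i \mid s_1, \dots, s_{i-1})$, can be sketched in a few lines. This is an illustration only: `cond_prob_up` is a hypothetical hand-written rule standing in for a trained network, and it is not the HAN architecture.

```python
import itertools
import math
import random

def cond_prob_up(prefix):
    """Hypothetical stand-in for a network: P(s_i = +1 | earlier spins)."""
    # Toy rule: favour alignment with the spins generated so far.
    field = sum(prefix) / (len(prefix) + 1)
    return 1.0 / (1.0 + math.exp(-2.0 * field))

def log_prob(config):
    """ln q(s) as the sum of the logs of the conditional probabilities."""
    total = 0.0
    for i, s in enumerate(config):
        p_up = cond_prob_up(config[:i])
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total

def sample(n_spins, rng):
    """Draw spins one at a time; return the configuration and ln q(s)."""
    config = []
    for _ in range(n_spins):
        p_up = cond_prob_up(config)
        config.append(1 if rng.random() < p_up else -1)
    return tuple(config), log_prob(config)

rng = random.Random(0)
config, lq = sample(4, rng)
# Because each conditional is normalized, q sums to one over all states.
norm = sum(math.exp(log_prob(c))
           for c in itertools.product((1, -1), repeat=4))
```

The same conditionals serve both purposes named in the abstract: `sample` generates configurations and `log_prob` evaluates the probability of a given one explicitly.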

QUANT-PH · Jun 4, 2025
Estimation of the reduced density matrix and entanglement entropies using autoregressive networks

Piotr Białas, Piotr Korcyl, Tomasz Stebel et al.

We present an application of autoregressive neural networks to Monte Carlo simulations of quantum spin chains using the correspondence with classical two-dimensional spin systems. We use a hierarchy of neural networks capable of estimating conditional probabilities of consecutive spins to evaluate elements of reduced density matrices directly. Using the Ising chain as an example, we calculate the continuum limit of the ground state's von Neumann and Rényi bipartite entanglement entropies of an interval built of up to 5 spins. We demonstrate that our architecture is able to estimate all the needed matrix elements with just a single training for a fixed time discretization and lattice volume. Our method can be applied to other types of spin chains, possibly with defects, as well as to estimating entanglement entropies of thermal states at non-zero temperature.

STAT-MECH · Nov 19, 2021
Analysis of autocorrelation times in Neural Markov Chain Monte Carlo simulations

Piotr Białas, Piotr Korcyl, Tomasz Stebel

We provide an in-depth study of autocorrelations in Neural Markov Chain Monte Carlo (NMCMC) simulations, a version of the traditional Metropolis algorithm which employs neural networks to provide independent proposals. We illustrate our ideas using the two-dimensional Ising model. We discuss several estimates of autocorrelation times in the context of NMCMC, some inspired by analytical results derived for the Metropolized Independent Sampler (MIS). We check their reliability by estimating them on a small system where analytical results can also be obtained. Based on the analytical results for MIS we propose a new loss function and study its impact on the autocorrelation times. Although this function's performance is slightly inferior to that of the traditional Kullback-Leibler divergence, it offers two training algorithms which in some situations may be beneficial. By studying a small, $4 \times 4$, system we gain access to the dynamics of the training process, which we visualize using several observables. Furthermore, we quantitatively investigate the impact on the autocorrelation times of imposing global discrete symmetries of the system in the neural network training process. Finally, we propose a scheme incorporating partial heat-bath updates, which considerably improves the quality of the training. The impact of the above enhancements is discussed for a $16 \times 16$ spin system. The summary of our findings may serve as guidance for the implementation of Neural Markov Chain Monte Carlo simulations for more complicated models.
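The accept/reject step of a Metropolized Independent Sampler, in which proposals do not depend on the current state, can be sketched as follows. This is a generic illustration, not the authors' code: `propose` and `log_p` are hypothetical callables for a sampler returning `(state, ln q)` and for the unnormalized target log-probability.

```python
import math
import random

def mis_accept_prob(log_p_cur, log_q_cur, log_p_new, log_q_new):
    """min(1, [p(new) q(cur)] / [p(cur) q(new)]), computed in log space."""
    log_ratio = (log_p_new - log_q_new) - (log_p_cur - log_q_cur)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)

def run_chain(propose, log_p, n_steps, rng):
    """Run the chain; return the visited states and the acceptance rate."""
    cur, log_q_cur = propose(rng)
    log_p_cur = log_p(cur)
    chain, accepted = [cur], 0
    for _ in range(n_steps):
        new, log_q_new = propose(rng)
        log_p_new = log_p(new)
        a = mis_accept_prob(log_p_cur, log_q_cur, log_p_new, log_q_new)
        if rng.random() < a:
            cur, log_p_cur, log_q_cur = new, log_p_new, log_q_new
            accepted += 1
        chain.append(cur)  # a rejected proposal repeats the current state
    return chain, accepted / n_steps

# Toy check: two spins, uniform proposal, target favouring aligned spins.
def propose(rng):
    s = (rng.choice((1, -1)), rng.choice((1, -1)))
    return s, math.log(0.25)

def log_p(s):
    return 0.5 * s[0] * s[1]  # unnormalized: beta * J * s0 * s1 with beta*J = 0.5

chain, rate = run_chain(propose, log_p, 1000, random.Random(1))
```

Rejected proposals repeat the current state in the chain, which is the source of the autocorrelations the abstract analyzes; when $q$ equals the target, every proposal is accepted.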