LG, May 9

Spherical Boltzmann machines: a solvable theory of learning and generation in energy-based models

arXiv:2605.09031
AI Analysis

This work provides a theoretical framework for understanding learning and generation in energy-based models, which are widely used but poorly understood.

The authors analyze a solvable energy-based model, the spherical Boltzmann machine, in the high-dimensional limit, deriving exact equations for training dynamics and uncovering phase transitions that explain phenomena like double descent and sampling temperature tuning. These phenomena are shown to extend to standard generative architectures.

Energy-based models (EBMs) are flexible generative architectures inspired by statistical physics, but their learning and generative properties remain poorly understood. Here, we analyze a solvable EBM in the high-dimensional limit: the spherical Boltzmann machine (SBM). Combining tools from random matrix theory and dynamical mean-field theory, we: solve exact equations describing the training dynamics of the SBM; compute the Bayesian evidence, which acts as a partition function in parameter space and encodes global properties of the trained model; and uncover cascades of phase transitions that occur both during training and as a function of hyperparameters, related to successive alignment and condensation of the top modes of the coupling matrix to the data. We connect these transitions to sampling-time generative phenomena in a teacher-student scenario, including: sampling temperature tuning, double descent as a function of regularization strength, tempered posterior effects, and out-of-equilibrium effects during training that induce biases in the trained model. We provide numerical evidence demonstrating that all these phenomena appear in standard generative architectures, beyond the SBM.
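The abstract does not state the SBM energy function. As a rough illustration only, the sketch below assumes the pairwise spherical form familiar from statistical physics, E(x) = -1/2 x^T J x with the constraint |x|^2 = N, and a hypothetical coupling matrix built as a rank-one "data" direction plus symmetric Gaussian noise. It shows what alignment of the top mode of the coupling matrix with a data direction could look like numerically. The names project_to_sphere, energy, and top_mode_overlap, and the spike strength 5.0, are illustrative choices, not taken from the paper.

```python
import numpy as np

def project_to_sphere(x):
    """Rescale x so that |x|^2 = N (assumed spherical constraint)."""
    n = x.shape[-1]
    return x * np.sqrt(n) / np.linalg.norm(x, axis=-1, keepdims=True)

def energy(x, J):
    """Assumed pairwise energy E(x) = -1/2 x^T J x (not stated in the abstract)."""
    return -0.5 * np.einsum("...i,ij,...j->...", x, J, x)

def top_mode_overlap(J, direction):
    """Squared overlap between the top eigenvector of J and a given direction."""
    _, eigvecs = np.linalg.eigh(J)      # eigenvalues returned in ascending order
    top = eigvecs[:, -1]                # eigenvector of the largest eigenvalue
    u = direction / np.linalg.norm(direction)
    return float(np.dot(top, u) ** 2)

rng = np.random.default_rng(0)
N = 50

# Hypothetical "data" direction (unit vector).
u = rng.standard_normal(N)
u /= np.linalg.norm(u)

# Symmetric Gaussian noise with off-diagonal variance 1/N.
noise = rng.standard_normal((N, N))
noise = (noise + noise.T) / np.sqrt(2 * N)

# Illustrative coupling matrix: strong rank-one signal plus noise.
J = 5.0 * np.outer(u, u) + noise

x = project_to_sphere(rng.standard_normal(N))   # random configuration on the sphere
_, vecs = np.linalg.eigh(J)
x_top = np.sqrt(N) * vecs[:, -1]                # configuration along the top mode

overlap = top_mode_overlap(J, u)
print("top-mode overlap with data direction:", overlap)
print("energy (random):", energy(x, J), " energy (top mode):", energy(x_top, J))
```

Under this assumed energy, the configuration along the top eigenvector minimizes the energy on the sphere, so a coupling matrix whose top mode aligns with the data concentrates low-temperature samples near the data direction. Whether the paper uses exactly this form is not confirmed by the abstract.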

