Sampling with flows, diffusion and autoregressive neural networks: A spin-glass perspective
This work offers theoretical insight into the capabilities and limitations of generative models, of interest to researchers in machine learning and statistical physics, though it is incremental in that it builds on known frameworks.
The paper analyzes the sampling efficiency of modern generative models (flows, diffusion, autoregressive networks) against traditional methods such as MCMC and Langevin dynamics on spin-glass-like probability distributions. It finds that the two families have complementary strengths and weaknesses: because of phase transitions, each fails in some parameter regions where the other succeeds.
Recent years have witnessed the development of powerful generative models based on flows, diffusion or autoregressive neural networks, achieving remarkable success in generating data from examples, with applications in a broad range of areas. A theoretical analysis of the performance of these methods, and an understanding of their limitations, remain challenging, however. In this paper, we take a step in this direction by analysing the efficiency of sampling by these methods on a class of problems with a known probability distribution, and comparing it with the sampling performance of more traditional methods such as Markov chain Monte Carlo and Langevin dynamics. We focus on a class of probability distributions widely studied in the statistical physics of disordered systems, which relate to spin glasses, statistical inference and constraint satisfaction problems. We leverage the fact that sampling via flow-based, diffusion-based or autoregressive network methods can be equivalently mapped to the analysis of Bayes-optimal denoising of a modified probability measure. Our findings demonstrate that these methods encounter difficulties in sampling that stem from the presence of a first-order phase transition along the algorithm's denoising path. Our conclusions go both ways: we identify regions of parameters where these methods are unable to sample efficiently, while efficient sampling is possible using standard Monte Carlo or Langevin approaches. We also identify regions where the opposite happens: standard approaches are inefficient while the discussed generative methods work well.
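To make the link between sampling and Bayes-optimal denoising concrete, here is a minimal sketch on a toy system. It is an illustration under assumptions that are not stated in the text: the target is taken to be an Ising-type measure P(x) ∝ exp((β/2) xᵀJx) on ±1 spins, the denoising path is the stochastic-localization form dy = m(y) dt + dB with m(y) the posterior mean under the tilted measure P(x) exp(⟨y, x⟩), and the posterior mean is computed exactly by enumerating all configurations, which is only feasible for a handful of spins. The function names are hypothetical.

```python
import itertools
import numpy as np

def all_configs(n):
    # All 2^n configurations of n spins in {-1, +1} (toy sizes only).
    return np.array(list(itertools.product([-1.0, 1.0], repeat=n)))

def log_weights(configs, J, beta):
    # Assumed target: log P(x) = (beta / 2) * x^T J x + const.
    return 0.5 * beta * np.einsum("ki,ij,kj->k", configs, J, configs)

def posterior_mean(y, configs, logw):
    # Exact Bayes-optimal denoiser: mean of x under P(x) * exp(<y, x>).
    a = logw + configs @ y
    a -= a.max()                      # stabilise the exponentials
    p = np.exp(a)
    p /= p.sum()
    return p @ configs

def sample_by_denoising(J, beta, rng, t_max=50.0, dt=0.01):
    # Euler discretisation of dy = m(y) dt + dB, starting from y = 0.
    n = J.shape[0]
    configs = all_configs(n)
    logw = log_weights(configs, J, beta)
    y = np.zeros(n)
    for _ in range(int(t_max / dt)):
        m = posterior_mean(y, configs, logw)
        y += m * dt + np.sqrt(dt) * rng.standard_normal(n)
    # At large t the tilted measure concentrates on one configuration.
    m = posterior_mean(y, configs, logw)
    return np.where(m >= 0, 1.0, -1.0)

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) / np.sqrt(n)
J = (A + A.T) / 2.0                   # symmetric random couplings
np.fill_diagonal(J, 0.0)
x = sample_by_denoising(J, beta=0.5, rng=rng)
print(x)
```

In this sketch the denoiser is exact, so the only error comes from time discretisation and the finite horizon `t_max`. For large systems the enumeration is impossible, and the posterior mean would have to be approximated, for instance by a trained network; the difficulty discussed in the abstract concerns whether that denoising step can be carried out efficiently along the whole path.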