DS PRMay 7

Discrete Optimal Transport: Rapid Convergence of Simulated Annealing Algorithms

Yuchen He, Tianhui Jiang, Sihan Wang, Chihao Zhang

arXiv:2605.0587776.5

AI Analysis

Provides the first non-asymptotic convergence guarantees for simulated annealing on discrete state spaces with polynomial bounds, addressing a key theoretical gap for practitioners in statistical physics and optimization.

This paper develops a discrete optimal transport framework to analyze simulated annealing on finite state spaces, proving non-asymptotic convergence guarantees. For the mean-field Ising model, annealed single-site Glauber dynamics achieves ε error in KL divergence in O(n^5β^2/ε) steps at any inverse temperature β≥0; for the mean-field q-state Potts model, annealed (q-1)-block Glauber dynamics achieves ε error in poly(n, β, 1/ε) steps for β≥β_s=q/2.

We develop a discrete optimal transport framework for analyzing simulated annealing algorithms on finite state spaces. Building on the discrete Wasserstein metric introduced by Maas (J. Funct. Anal., 2011), we define a generalized discrete Wasserstein-2 distance and the associated notion of \emph{discrete action} for paths of probability measures on graphs. Using these tools, we establish non-asymptotic convergence guarantees for simulated annealing: the KL divergence between the algorithm's output and the target distribution is controlled by the discrete action of the annealing path. This can be viewed as the discrete counterpart of the action-based analysis of annealed Langevin dynamics in continuous spaces by Guo, Tao, and Chen (ICLR 2025). As applications, we analyze simulated annealing for two fundamental models in statistical physics. For the \emph{mean-field Ising model}, we show that annealed single-site Glauber dynamics achieves $\varepsilon$ error in KL divergence in $O(n^5β^2/\varepsilon)$ steps at \emph{any} inverse temperature $β\ge 0$. For the \emph{mean-field $q$-state Potts model}, we show that annealed $(q-1)$-block Glauber dynamics achieves $\varepsilon$ error in $\mathrm{poly}(n, β, 1/\varepsilon)$ steps for all $β\ge β_{\mathsf{s}}=q/2$, the regime where the disordered phase has completely lost stability. In both cases, the key technical contribution is a polynomial upper bound on the discrete action, obtained by exploiting the symmetry of the model to reduce the analysis to a low-dimensional projected chain.

View on arXiv PDF

Similar