MLMay 28
Wasserstein Contraction of Coordinate Ascent Variational InferenceRocco Caprio, Adrien Corenflos, Sam Power
We study the contraction in Wasserstein distance of the coordinate ascent variational inference algorithm. This is shown to hold under a transport-information inequality at the fixed points and a functional smoothness condition. The results are general and sharp, allow for local convergence guarantees, hold for general smooth manifolds, and also in some non-smooth spaces. We consider applications to Bayesian Gaussian Mixture Models, and high-dimensional Bayesian Probit Regression, and Logistic Regression with Pólya-Gamma random variables (i.e. Jaakkola-Jordan's algorithm).
MLJul 25, 2024
Fast convergence of the Expectation Maximization algorithm under a logarithmic Sobolev inequalityRocco Caprio, Adam M Johansen
We present a new framework for analysing the Expectation Maximization (EM) algorithm. Drawing on recent advances in the theory of gradient flows over Euclidean-Wasserstein spaces, we extend techniques from alternating minimization in Euclidean spaces to the EM algorithm, via its representation as coordinate-wise minimization of the free energy. In so doing, we obtain finite sample error bounds and exponential convergence of the EM algorithm under a natural generalisation of the log-Sobolev inequality. We further show that this framework naturally extends to several variants of EM, offering a unified approach for studying such algorithms.
LGMar 4, 2024
Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalitiesRocco Caprio, Juan Kuntz, Samuel Power et al.
We prove non-asymptotic error bounds for particle gradient descent (PGD, Kuntz et al., 2023), a recently introduced algorithm for maximum likelihood estimation of large latent variable models obtained by discretizing a gradient flow of the free energy. We begin by showing that the flow converges exponentially fast to the free energy's minimizers for models satisfying a condition that generalizes both the log-Sobolev and the Polyak--Łojasiewicz inequalities (LSI and PŁI, respectively). We achieve this by extending a result well-known in the optimal transport literature (that the LSI implies the Talagrand inequality) and its counterpart in the optimization literature (that the PŁI implies the so-called quadratic growth condition), and applying the extension to our new setting. We also generalize the Bakry--Émery Theorem and show that the LSI/PŁI extension holds for models with strongly concave log-likelihoods. For such models, we further control PGD's discretization error and obtain the non-asymptotic error bounds. While we are motivated by the study of PGD, we believe that the inequalities and results we extend may be of independent interest.