ML LG May 22, 2025

Exponential Convergence of CAVI for Bayesian PCA

Arghya Datta, Philippe Gagnon, Florian Maire

arXiv:2505.16145v14.51 citations h-index: 1

Originality: Incremental advance

AI Analysis

The paper addresses a gap in understanding the convergence speed of a widely used inference method for Bayesian PCA. The advance is incremental, but it provides theoretical guarantees for practitioners in machine learning and statistics.

The paper proves exponential convergence of the coordinate ascent variational inference (CAVI) algorithm for Bayesian PCA, establishing a precise result for a single principal component and a more general one for multiple components, with the latter requiring a novel lower bound for the symmetric Kullback-Leibler divergence.
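For orientation, the symmetric Kullback-Leibler divergence between two multivariate normal distributions has a standard closed form, written out below. This is a textbook identity and not the paper's new lower bound. The symbols $\mu_0, \mu_1, \Sigma_0, \Sigma_1, d$ are notation assumed here, and the sum convention is also an assumption (some authors use half of this quantity).

```latex
% Assumed notation: p = N(\mu_0, \Sigma_0), q = N(\mu_1, \Sigma_1) on R^d,
% with \Sigma_0 and \Sigma_1 positive definite.
\begin{align*}
\mathrm{KL}(p \,\|\, q)
  &= \tfrac{1}{2}\Big[ \operatorname{tr}(\Sigma_1^{-1}\Sigma_0)
     + (\mu_1 - \mu_0)^\top \Sigma_1^{-1} (\mu_1 - \mu_0)
     - d + \ln\frac{\det \Sigma_1}{\det \Sigma_0} \Big], \\
\mathrm{KL}(p \,\|\, q) + \mathrm{KL}(q \,\|\, p)
  &= \tfrac{1}{2}\Big[ \operatorname{tr}(\Sigma_1^{-1}\Sigma_0)
     + \operatorname{tr}(\Sigma_0^{-1}\Sigma_1) - 2d
     + (\mu_1 - \mu_0)^\top (\Sigma_0^{-1} + \Sigma_1^{-1}) (\mu_1 - \mu_0) \Big].
\end{align*}
```

The log-determinant terms cancel in the symmetric version, so it depends only on trace terms and a quadratic form in the mean difference.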

Probabilistic principal component analysis (PCA) and its Bayesian variant (BPCA) are widely used for dimension reduction in machine learning and statistics. The main advantage of probabilistic PCA over the traditional formulation is that it allows uncertainty quantification. The parameters of BPCA are typically learned using mean-field variational inference, and in particular, the coordinate ascent variational inference (CAVI) algorithm. So far, the convergence speed of CAVI for BPCA has not been characterized. In our paper, we fill this gap in the literature. Firstly, we prove a precise exponential convergence result in the case where the model uses a single principal component (PC). Interestingly, this result is established through a connection with the classical power iteration algorithm, and it indicates that traditional PCA is retrieved as point estimates of the BPCA parameters. Secondly, we leverage recent tools to prove exponential convergence of CAVI for the model with any number of PCs, thus leading to a more general result, but one that is of a slightly different flavor. To prove the latter result, we additionally needed to introduce a novel lower bound for the symmetric Kullback-Leibler divergence between two multivariate normal distributions, which, we believe, is of independent interest in information theory.
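To make the power iteration reference concrete, here is a minimal sketch of the classical algorithm in plain Python. It is not the paper's CAVI update. The function name `power_iteration`, its parameters, and the toy covariance matrix are illustrative assumptions. The iteration repeatedly multiplies a vector by the matrix and renormalizes it, and the error shrinks geometrically at a rate set by the ratio of the two largest eigenvalues.

```python
import math


def power_iteration(matrix, num_iters=200, tol=1e-12):
    """Estimate the dominant eigenpair of a symmetric PSD matrix.

    `matrix` is a list of lists. The name and signature are hypothetical,
    chosen for this sketch only.
    """
    n = len(matrix)
    # Assumption: a uniform start vector that is not orthogonal to the
    # dominant eigenvector (true for the toy matrix used below).
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(num_iters):
        # Multiply by the matrix, then renormalize to unit length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space of the matrix")
        w = [x / norm for x in w]
        # The Rayleigh quotient of the unit vector w estimates the eigenvalue.
        mw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        new_eigenvalue = sum(w[i] * mw[i] for i in range(n))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        v, eigenvalue = w, new_eigenvalue
        if converged:
            break
    return eigenvalue, v


# Toy 2x2 "sample covariance" (made-up numbers for illustration).
cov = [[4.0, 1.0], [1.0, 3.0]]
top_eigenvalue, top_direction = power_iteration(cov)
```

In traditional PCA, the returned direction plays the role of the first principal component of the data whose covariance is `cov`.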

