Convergence rates for optimised adaptive importance samplers
This work addresses a gap in the theoretical understanding of adaptive Monte Carlo methods for statisticians and machine learning practitioners by providing explicit convergence rates, but it is incremental in that it builds on existing variance-minimisation strategies.
The paper analyses convergence rates for adaptive importance samplers, introducing optimised adaptive importance samplers (OAIS) based on convex optimisation. It proves non-asymptotic error bounds showing that when the target is in the exponential family, the $L_2$ errors converge at the optimal rate of $\mathcal{O}(1/\sqrt{N})$, and when it is not, the asymptotic $L_2$ error increases by a factor $\sqrt{\rho^\star} > 1$.
Adaptive importance samplers are adaptive Monte Carlo algorithms for estimating expectations with respect to some target distribution which \textit{adapt} themselves to obtain better estimators over a sequence of iterations. Although it is straightforward to show that they have the same $\mathcal{O}(1/\sqrt{N})$ convergence rate as standard importance samplers, where $N$ is the number of Monte Carlo samples, the behaviour of adaptive importance samplers over the number of iterations has been left relatively unexplored. In this work, we investigate an adaptation strategy based on convex optimisation which leads to a class of adaptive importance samplers termed \textit{optimised adaptive importance samplers} (OAIS). These samplers rely on the iterative minimisation of the $\chi^2$-divergence between an exponential-family proposal and the target. The analysed algorithms are closely related to the class of adaptive importance samplers which minimise the variance of the weight function. We first prove non-asymptotic error bounds for the mean squared errors (MSEs) of these algorithms, which explicitly depend on the number of iterations and the number of samples together. The non-asymptotic bounds derived in this paper imply that when the target belongs to the exponential family, the $L_2$ errors of the optimised samplers converge at the optimal rate of $\mathcal{O}(1/\sqrt{N})$, and the rate of convergence in the number of iterations is explicitly provided. When the target does not belong to the exponential family, the rate of convergence is the same but the asymptotic $L_2$ error increases by a factor $\sqrt{\rho^\star} > 1$, where $\rho^\star - 1$ is the minimum $\chi^2$-divergence between the target and an exponential-family proposal.
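The adaptation idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm as stated, only a toy instance under assumptions chosen here for illustration: the target is a standard normal, the proposal is $q_\mu = \mathcal{N}(\mu, 1)$ with the mean as its only parameter, and the quantity $\rho(\mu) = \mathbb{E}_{q_\mu}[(\pi/q_\mu)^2]$ (the $\chi^2$-divergence plus one) is minimised by plain stochastic gradient descent with a fixed step size. The function names `run_oais`, `rho_exact` and `snis_estimate`, the step size, and the sample sizes are all hypothetical choices.

```python
import numpy as np


def log_target(x):
    # Toy target (assumption): log-density of the standard normal N(0, 1).
    return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)


def log_proposal(x, mu):
    # Proposal q_mu = N(mu, 1), an exponential-family member indexed by mu.
    return -0.5 * (x - mu) ** 2 - 0.5 * np.log(2.0 * np.pi)


def rho_exact(mu):
    # Closed form of rho(mu) = E_q[(pi/q)^2] = exp(mu^2); valid for this toy pair only.
    return float(np.exp(mu**2))


def run_oais(mu0=1.0, n_samples=1000, n_iters=300, step=0.02, seed=0):
    """Stochastic gradient descent on rho(mu); returns the sequence of mu values."""
    rng = np.random.default_rng(seed)
    mu = float(mu0)
    history = [mu]
    for _ in range(n_iters):
        x = rng.normal(mu, 1.0, size=n_samples)           # sample from current proposal
        w = np.exp(log_target(x) - log_proposal(x, mu))   # importance weights pi/q
        # grad rho(mu) = -E_q[w^2 * d/dmu log q_mu(x)] = E_q[w^2 * (mu - x)]
        grad = np.mean(w**2 * (mu - x))
        mu -= step * grad
        history.append(mu)
    return np.array(history)


def snis_estimate(phi, mu, n_samples, rng):
    """Self-normalised importance sampling estimate of E_pi[phi] using q_mu."""
    x = rng.normal(mu, 1.0, size=n_samples)
    log_w = log_target(x) - log_proposal(x, mu)
    w = np.exp(log_w - log_w.max())                       # stabilised, unnormalised weights
    return float(np.sum(w * phi(x)) / np.sum(w))


if __name__ == "__main__":
    mus = run_oais()
    print("initial mu, rho:", mus[0], rho_exact(mus[0]))
    print("final   mu, rho:", mus[-1], rho_exact(mus[-1]))
    rng = np.random.default_rng(1)
    print("estimate of E[x^2]:", snis_estimate(lambda x: x**2, mus[-1], 5000, rng))
```

In this toy setting the target lies inside the proposal family, so the minimum of $\rho$ is $1$ and the optimised proposal coincides with the target. A target outside the family would leave a residual $\rho^\star > 1$, which is the factor that appears in the asymptotic $L_2$ error.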