CO · Dec 2, 2025
Convergence of a class of gradient-free optimisation schemes when the objective function is noisy, irregular, or both
Christophe Andrieu, Nicolas Chopin, Ettore Fincato et al.
We investigate the convergence properties of a class of iterative algorithms designed to minimize a potentially non-smooth and noisy objective function, which may be algebraically intractable and whose values may be obtained as the output of a black box. The algorithms considered can be cast under the umbrella of a generalised gradient descent recursion, where the gradient is that of a smooth approximation of the objective function. The framework we develop includes as special cases model-based and mollification methods, two classical approaches to zeroth-order optimisation. The convergence results are obtained under very weak assumptions on the regularity of the objective function and involve a trade-off between the degree of smoothing and size of the steps taken in the parameter updates. As expected, additional assumptions are required in the stochastic case. We illustrate the relevance of these algorithms and our convergence results through a challenging classification example from machine learning.
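As a rough illustration of the mollification idea described in this abstract, the sketch below runs gradient descent on a Gaussian-smoothed version of a non-smooth objective, using only function evaluations. It is not the authors' algorithm: the function names, the test objective, the antithetic gradient estimator, and the schedules for the step size and the smoothing width `sigma` are all assumptions chosen for illustration. Both schedules shrink over the iterations, which is one simple way to picture the trade-off between smoothing and step size.

```python
import random


def smoothed_gradient(f, x, sigma, n_samples, rng):
    """Monte Carlo estimate of the gradient of the Gaussian-mollified f at x.

    Uses antithetic pairs x + sigma*u and x - sigma*u with u ~ N(0, I).
    """
    d = len(x)
    grad = [0.0] * d
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + sigma * ui for xi, ui in zip(x, u)]
        x_minus = [xi - sigma * ui for xi, ui in zip(x, u)]
        diff = (f(x_plus) - f(x_minus)) / (2.0 * sigma)
        for i in range(d):
            grad[i] += diff * u[i] / n_samples
    return grad


def minimise(f, x0, n_iters=2000, n_samples=8, seed=0):
    """Gradient descent on a progressively less smoothed objective."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, n_iters + 1):
        step = 0.5 / k ** 0.75   # assumed step-size schedule
        sigma = 1.0 / k ** 0.25  # assumed smoothing schedule
        g = smoothed_gradient(f, x, sigma, n_samples, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def objective(x):
    """Hypothetical non-smooth test objective, minimised at (1, 1)."""
    return sum(abs(xi - 1.0) for xi in x)


x_hat = minimise(objective, [3.0, -2.0])
print(x_hat)
```

Only black-box evaluations of `objective` are used, so the same loop applies when the function has no usable derivative.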
ML · Jul 9, 2020
A Global Stochastic Optimization Particle Filter Algorithm
Mathieu Gerber, Randal Douc
We introduce G-PFSO, a new online algorithm for expected log-likelihood maximization in situations where the objective function is multi-modal and/or has saddle points. The key element underpinning G-PFSO is a probability distribution which (a) is shown to concentrate on the target parameter value as the sample size increases and (b) can be efficiently estimated by means of a standard particle filter algorithm. This distribution depends on a learning rate, where the faster the learning rate the quicker it concentrates on the desired element of the search space, but the less likely G-PFSO is to escape from a local optimum of the objective function. In order to achieve a fast convergence rate with a slow learning rate, G-PFSO exploits the acceleration property of averaging, well-known in the stochastic gradient literature. Considering several challenging estimation problems, the numerical experiments show that, with high probability, G-PFSO successfully finds the highest mode of the objective function and converges to its global maximizer at the optimal rate. While the focus of this work is expected log-likelihood maximization, the proposed methodology and its theory apply more generally for optimizing a function defined through an expectation.
PR · Jul 5, 2017
Convergence Results for a Class of Time-Varying Simulated Annealing Algorithms
Mathieu Gerber, Luke Bornn
We provide a set of conditions which ensure the almost sure convergence of a class of simulated annealing algorithms on a bounded set $\mathcal{X}\subset\mathbb{R}^d$ based on a time-varying Markov kernel. The class of algorithms considered in this work encompasses the one studied in Belisle (1992) and Yang (2000) as well as its derandomized version recently proposed by Gerber and Bornn (2016). To the best of our knowledge, the results we derive are the first examples of almost sure convergence results for simulated annealing based on a time-varying kernel. In addition, the assumptions on the Markov kernel and on the cooling schedule have the advantage of being trivial to verify in practice.
CO · Sep 5, 2016
Improving Simulated Annealing through Derandomization
Mathieu Gerber, Luke Bornn
We propose and study a version of simulated annealing (SA) on continuous state spaces based on $(t,s)_R$-sequences. The parameter $R\in\bar{\mathbb{N}}$ regulates the degree of randomness of the input sequence, with the case $R=0$ corresponding to IID uniform random numbers and the limiting case $R=\infty$ to $(t,s)$-sequences. Our main result, obtained for rectangular domains, shows that the resulting optimization method, which we refer to as QMC-SA, converges almost surely to the global optimum of the objective function $φ$ for any $R\in\mathbb{N}$. When $φ$ is univariate, we are in addition able to show that the completely deterministic version of QMC-SA is convergent. A key property of these results is that they do not require objective-dependent conditions on the cooling schedule. As a corollary of our theoretical analysis, we provide a new almost sure convergence result for SA which shares this property under minimal assumptions on $φ$. We further explain how our results in fact apply to a broader class of optimization methods including for example threshold accepting, for which to our knowledge no convergence results currently exist. We finally illustrate the superiority of QMC-SA over SA algorithms in a numerical study.
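For orientation, here is a minimal sketch of ordinary simulated annealing on a rectangular domain, driven by IID pseudo-random numbers. It is not QMC-SA: no $(t,s)_R$-sequence is used, and the Gaussian proposal, the logarithmic cooling schedule, the test function `phi`, and all names are assumptions made for illustration. The sketch maximises `phi`.

```python
import math
import random


def simulated_annealing(phi, lower, upper, n_iters=20000, seed=0):
    """Maximise phi over the rectangle [lower, upper] with plain SA."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    fx = phi(x)
    best_x, best_f = list(x), fx
    for n in range(1, n_iters + 1):
        temperature = 1.0 / math.log(n + 1.0)  # assumed cooling schedule
        # Symmetric Gaussian proposal scaled to the width of each side.
        y = [xi + rng.gauss(0.0, 0.1 * (hi - lo))
             for xi, lo, hi in zip(x, lower, upper)]
        # Proposals outside the rectangle are rejected outright.
        if any(yi < lo or yi > hi for yi, lo, hi in zip(y, lower, upper)):
            continue
        fy = phi(y)
        # Metropolis acceptance rule for a maximisation problem.
        if fy >= fx or rng.random() < math.exp((fy - fx) / temperature):
            x, fx = y, fy
            if fx > best_f:
                best_x, best_f = list(x), fx
    return best_x, best_f


def phi(x):
    """Hypothetical multi-modal objective; global maximum 0.1 at (0.3, 0.6)."""
    a, b = x[0] - 0.3, x[1] - 0.6
    return -(a * a + b * b) + 0.05 * (math.cos(40 * a) + math.cos(40 * b))


best_x, best_f = simulated_annealing(phi, [0.0, 0.0], [1.0, 1.0])
print(best_x, best_f)
```

The only source of randomness is the `rng` object, which is the input stream that a derandomized variant would replace.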
CO · Jun 8, 2015
On Integration Methods Based on Scrambled Nets of Arbitrary Size
Mathieu Gerber
We consider the problem of evaluating $I(φ):=\int_{[0,1)^s}φ(x) dx$ for a function $φ\in L^2[0,1)^{s}$. In situations where $I(φ)$ can be approximated by an estimate of the form $N^{-1}\sum_{n=0}^{N-1}φ(x^n)$, with $\{x^n\}_{n=0}^{N-1}$ a point set in $[0,1)^s$, it is now well known that the $O_P(N^{-1/2})$ Monte Carlo convergence rate can be improved by taking for $\{x^n\}_{n=0}^{N-1}$ the first $N=λb^m$ points, $λ\in\{1,\dots,b-1\}$, of a scrambled $(t,s)$-sequence in base $b\geq 2$. In this paper we derive a bound for the variance of scrambled net quadrature rules which is of order $o(N^{-1})$ without any restriction on $N$. As a corollary, this bound allows us to provide simple conditions to get, for any pattern of $N$, an integration error of size $o_P(N^{-1/2})$ for functions that depend on the quadrature size $N$. Notably, we establish that sequential quasi-Monte Carlo (M. Gerber and N. Chopin, 2015, \emph{J. R. Statist. Soc. B, to appear.}) reaches the $o_P(N^{-1/2})$ convergence rate for any values of $N$. In a numerical study, we show that for scrambled net quadrature rules we can relax the constraint on $N$ without any loss of efficiency when the integrand $φ$ is a discontinuous function while, for sequential quasi-Monte Carlo, taking $N=λb^m$ may only provide moderate gains.
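The estimator $N^{-1}\sum_{n=0}^{N-1}φ(x^n)$ can be sketched in the simplest setting, $s=1$ and $b=2$, with a nested (Owen-style) scrambling of the van der Corput sequence. This is an illustrative assumption, not the construction analysed in the paper: the function name, the 32-digit truncation, and the integrand are hypothetical choices. The quadrature size $N=1000$ is deliberately not of the form $λb^m$.

```python
import random


def scrambled_van_der_corput(n_points, n_digits=32, seed=0):
    """First n_points of a nested-scrambled van der Corput sequence, base 2.

    Digit k of point n is flipped by a random bit that depends only on the
    k-1 digits before it, which is nested uniform scrambling in base 2.
    """
    rng = random.Random(seed)
    flips = {}  # random flip bits, drawn lazily per (digit position, prefix)
    points = []
    for n in range(n_points):
        x = 0.0
        for k in range(1, n_digits + 1):
            digit = (n >> (k - 1)) & 1              # k-th radical-inverse digit
            key = (k, n & ((1 << (k - 1)) - 1))     # the k-1 preceding digits
            if key not in flips:
                flips[key] = rng.getrandbits(1)
            x += (digit ^ flips[key]) * 2.0 ** -k
        points.append(x)
    return points


def phi(x):
    """Hypothetical integrand; its integral over [0, 1) is 1/3."""
    return x * x


N = 1000  # arbitrary size, not a power of two
points = scrambled_van_der_corput(N)
estimate = sum(phi(x) for x in points) / N
print(estimate)
```

Changing `seed` gives an independent scrambling, so repeating the computation over several seeds yields an empirical estimate of the variance of the quadrature rule.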