Quasi-Newton Quasi-Monte Carlo for variational Bayes
The paper addresses optimization efficiency for practitioners of Bayesian inference, offering incremental improvements in the sampling accuracy available to second-order methods.
The paper tackles the optimization of noisy objectives in machine learning, such as those arising in variational Bayes, by combining randomized quasi-Monte Carlo (RQMC) sampling with stochastic L-BFGS. When Monte Carlo sampling has a root mean squared error (RMSE) of $O(n^{-1/2})$, RQMC attains an RMSE of $o(n^{-1/2})$, which can be close to $O(n^{-3/2})$ in favorable settings. In the reported experiments this speeds up the optimization and sometimes finds better parameter values than Monte Carlo does.
Many machine learning problems optimize an objective that must be measured with noise. The primary method is first-order stochastic gradient descent using one or more Monte Carlo (MC) samples at each step. There are settings where ill-conditioning makes second-order methods such as L-BFGS more effective. We study the use of randomized quasi-Monte Carlo (RQMC) sampling for such problems. When MC sampling has a root mean squared error (RMSE) of $O(n^{-1/2})$, RQMC has an RMSE of $o(n^{-1/2})$ that can be close to $O(n^{-3/2})$ in favorable settings. We prove that improved sampling accuracy translates directly to improved optimization. In our empirical investigations for variational Bayes, using RQMC with stochastic L-BFGS greatly speeds up the optimization, and sometimes finds a better parameter value than MC does.
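The accuracy gap between MC and RQMC can be illustrated with a small numerical experiment. The sketch below is not the paper's setup: the text here does not name an RQMC construction, so a randomly shifted rank-1 (Fibonacci) lattice is swapped in because it needs only the Python standard library. The test integrand, the sample size `n`, the lattice generator `gen`, and all function names are assumptions chosen for illustration. It estimates the same integral many times with each method and compares the empirical RMSE at equal sample size.

```python
import math
import random

# Hypothetical smooth test integrand on [0,1)^2 (not from the paper).
# Its exact integral is (e - 1)^2.
def integrand(x, y):
    return math.exp(x + y)

TRUE_VALUE = (math.e - 1.0) ** 2

def mc_estimate(n, rng):
    # Plain Monte Carlo: average over n independent uniform points.
    return sum(integrand(rng.random(), rng.random()) for _ in range(n)) / n

def rqmc_estimate(n, gen, rng):
    # Randomly shifted rank-1 lattice: point i is (i/n, i*gen/n),
    # and one uniform random shift (applied mod 1) is shared by all points.
    # Each shifted point is uniform on [0,1)^2, so the estimate is unbiased.
    shift_x, shift_y = rng.random(), rng.random()
    total = 0.0
    for i in range(n):
        x = (i / n + shift_x) % 1.0
        y = (i * gen / n + shift_y) % 1.0
        total += integrand(x, y)
    return total / n

def rmse(estimator, reps):
    # Empirical root mean squared error over independent replications.
    squared_errors = [(estimator() - TRUE_VALUE) ** 2 for _ in range(reps)]
    return math.sqrt(sum(squared_errors) / reps)

rng = random.Random(0)
n, gen = 987, 610   # consecutive Fibonacci numbers (assumed lattice choice)
reps = 200

rmse_mc = rmse(lambda: mc_estimate(n, rng), reps)
rmse_rqmc = rmse(lambda: rqmc_estimate(n, gen, rng), reps)

print(f"MC   RMSE at n={n}: {rmse_mc:.2e}")
print(f"RQMC RMSE at n={n}: {rmse_rqmc:.2e}")
```

In an optimizer, the same idea would apply to the noisy gradient rather than to a scalar integral: a lower-variance gradient estimate at each step is what a quasi-Newton method such as L-BFGS would then consume. This sketch only demonstrates the sampling side and makes no claim about the specific rates quoted above.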