James Cuin

h-index: 6
Papers: 2

2 Papers

ML · Jan 29
Efficient Stochastic Optimisation via Sequential Monte Carlo

James Cuin, Davide Carbone, Yanbo Tang et al.

The problem of optimising functions with intractable gradients frequently arises in machine learning and statistics, in settings ranging from maximum marginal likelihood estimation to fine-tuning of generative models. Stochastic approximation methods for this class of problems typically require inner sampling loops to obtain (biased) stochastic gradient estimates, which rapidly become computationally expensive. In this work, we develop sequential Monte Carlo (SMC) samplers for optimisation of functions with intractable gradients. Our approach replaces expensive inner sampling methods with efficient SMC approximations, which can result in significant computational gains. We establish convergence results for the basic recursions defined by our methodology, which the SMC samplers approximate. We demonstrate the effectiveness of our approach on the reward-tuning of energy-based models in various settings.
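To make the idea of replacing an inner sampling loop with a particle approximation concrete, here is a minimal sketch on a toy problem. It is not the authors' algorithm: the Gaussian latent-variable model, the function names (`log_joint`, `smc_optimise`), the multinomial resampling, the single random-walk Metropolis move, and all step sizes are assumptions chosen for illustration. The model is x ~ N(theta, 1), y | x ~ N(x, 1), so the marginal likelihood of y is maximised at theta = y and the gradient is the posterior expectation of x - theta.

```python
import numpy as np

def log_joint(x, theta, y):
    # Toy model (assumption): x ~ N(theta, 1), y | x ~ N(x, 1), constants dropped.
    return -0.5 * (x - theta) ** 2 - 0.5 * (y - x) ** 2

def smc_optimise(y, theta0=0.0, n_particles=1000, n_steps=200,
                 step_size=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = theta0
    # Initialise from the prior and weight by the likelihood p(y | x).
    x = rng.normal(theta, 1.0, size=n_particles)
    logw = -0.5 * (y - x) ** 2
    for _ in range(n_steps):
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Gradient of log p_theta(y) is E[x - theta | y]; estimate it with particles.
        grad = np.sum(w * (x - theta))
        theta_new = theta + step_size * grad
        # Reweight the same particles towards the posterior under theta_new.
        logw = logw + log_joint(x, theta_new, y) - log_joint(x, theta, y)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Multinomial resampling resets the weights to uniform.
        x = x[rng.choice(n_particles, size=n_particles, p=w)]
        logw = np.zeros(n_particles)
        # One random-walk Metropolis move that leaves the new posterior invariant.
        prop = x + 0.5 * rng.normal(size=n_particles)
        log_ratio = log_joint(prop, theta_new, y) - log_joint(x, theta_new, y)
        accept = np.log(rng.uniform(size=n_particles)) < log_ratio
        x = np.where(accept, prop, x)
        theta = theta_new
    return theta

theta_hat = smc_optimise(y=3.0)
```

The particle population is carried from one parameter value to the next, so no fresh sampler is run to convergence at each optimisation step; only a reweight, a resample, and one cheap move are needed.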

LG · Mar 2
Manifold Aware Denoising Score Matching (MAD)

Alona Levy-Jurgenson, Alvaro Prat, James Cuin et al.

A major focus in designing methods for learning distributions defined on manifolds is to alleviate the need to implicitly learn the manifold so that learning can concentrate on the data distribution within the manifold. However, accomplishing this often leads to compute-intensive solutions. In this work, we propose a simple modification to denoising score-matching in the ambient space to implicitly account for the manifold, thereby reducing the burden of learning the manifold while maintaining computational efficiency. Specifically, we propose a simple decomposition of the score function into a known component $s^{base}$ and a remainder component $s-s^{base}$ (the learning target), with the former implicitly including information on where the data manifold resides. We derive known components $s^{base}$ in analytical form for several important cases, including distributions over rotation matrices and discrete distributions, and use them to demonstrate the utility of this approach in those cases.
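The decomposition described above can be illustrated with a short sketch of how a residual learning target might be formed in denoising score matching. Everything specific here is an assumption for illustration: the helper names (`dsm_target`, `s_base_sphere`, `residual_target`) are hypothetical, and the base score used, a pull towards the nearest point on the unit sphere, is a stand-in rather than any of the analytical forms derived in the paper.

```python
import numpy as np

def dsm_target(x_noisy, x_clean, sigma):
    # Standard denoising score matching target for Gaussian noise:
    # the score of N(x_noisy; x_clean, sigma^2 I) with respect to x_noisy.
    return -(x_noisy - x_clean) / sigma**2

def s_base_sphere(x_noisy, sigma):
    # Hypothetical known component: points back to the nearest point on the
    # unit sphere. An illustrative stand-in, not the paper's s^{base}.
    proj = x_noisy / np.linalg.norm(x_noisy, axis=-1, keepdims=True)
    return -(x_noisy - proj) / sigma**2

def residual_target(x_noisy, x_clean, sigma):
    # The network would regress onto s - s^{base} instead of the full score s.
    return dsm_target(x_noisy, x_clean, sigma) - s_base_sphere(x_noisy, sigma)

rng = np.random.default_rng(0)
x_clean = rng.normal(size=(4, 3))
x_clean /= np.linalg.norm(x_clean, axis=-1, keepdims=True)  # data on the unit sphere
sigma = 0.5
x_noisy = x_clean + sigma * rng.normal(size=x_clean.shape)
target = residual_target(x_noisy, x_clean, sigma)
```

When the noise is purely radial, the stand-in base score already equals the full denoising target and the residual vanishes, which is the sense in which the known component carries the information about where the manifold lies.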