ML | LG | Oct 18, 2023

Optimising Distributions with Natural Gradient Surrogates

arXiv:2310.11837v2 | 1 citation | h-index: 2
Originality: Highly original
AI Analysis

This work addresses a practical bottleneck for researchers and practitioners who use natural gradient methods: for many distributions of interest, the natural gradient is difficult to compute. It offers a general technique that is simple to implement and does not require lengthy model-specific derivations.

The paper reframes the optimisation of a probability distribution as an optimisation over the parameters of a surrogate distribution for which the natural gradient is easy to compute. This extends efficient natural gradient methods to a wider variety of distributions. The method is demonstrated on maximum likelihood estimation and variational inference tasks.

Natural gradient methods have been used to optimise the parameters of probability distributions in a variety of settings, often resulting in fast-converging procedures. Unfortunately, for many distributions of interest, computing the natural gradient has a number of challenges. In this work we propose a novel technique for tackling such issues, which involves reframing the optimisation as one with respect to the parameters of a surrogate distribution, for which computing the natural gradient is easy. We give several examples of existing methods that can be interpreted as applying this technique, and propose a new method for applying it to a wide variety of problems. Our method expands the set of distributions that can be efficiently targeted with natural gradients. Furthermore, it is fast, easy to understand, simple to implement using standard autodiff software, and does not require lengthy model-specific derivations. We demonstrate our method on maximum likelihood estimation and variational inference tasks.
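To make the surrogate idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a Student's t target with fixed degrees of freedom `NU`, a Gaussian surrogate N(mu, sigma^2), and an identity map from the surrogate's (mu, sigma) onto the target's location and scale. All of these choices, the function names, and the data are illustrative assumptions. The gradient of the target's negative log-likelihood is preconditioned with the inverse Fisher matrix of the surrogate, which for a Gaussian in (mu, sigma) is diag(sigma^2, sigma^2 / 2). The gradients are derived by hand to keep the sketch dependency-free; autodiff software could supply them instead.

```python
import math

NU = 5.0  # assumed fixed degrees of freedom of the Student's t target


def nll(mu, sigma, xs):
    """Average Student's t negative log-likelihood, up to an additive constant."""
    total = 0.0
    for x in xs:
        z = (x - mu) / sigma
        total += 0.5 * (NU + 1.0) * math.log1p(z * z / NU)
    return math.log(sigma) + total / len(xs)


def grad(mu, sigma, xs):
    """Hand-derived gradient of nll with respect to (mu, sigma)."""
    g_mu = 0.0
    g_sigma = 0.0
    for x in xs:
        z = (x - mu) / sigma
        w = (NU + 1.0) / (NU + z * z)
        g_mu -= w * z / sigma
        g_sigma -= w * z * z / sigma
    n = len(xs)
    return g_mu / n, 1.0 / sigma + g_sigma / n


def surrogate_natural_gradient_step(mu, sigma, xs, lr=0.5):
    """One step preconditioned by the inverse Fisher of the Gaussian surrogate."""
    g_mu, g_sigma = grad(mu, sigma, xs)
    # Fisher of N(mu, sigma^2) in (mu, sigma) is diag(1/sigma^2, 2/sigma^2),
    # so its inverse rescales the two gradient components as below.
    nat_mu = sigma ** 2 * g_mu
    nat_sigma = 0.5 * sigma ** 2 * g_sigma
    return mu - lr * nat_mu, sigma - lr * nat_sigma


def fit(xs, mu=0.0, sigma=1.0, steps=500, lr=0.5):
    """Repeat the surrogate natural gradient step from a fixed starting point."""
    for _ in range(steps):
        mu, sigma = surrogate_natural_gradient_step(mu, sigma, xs, lr)
    return mu, sigma


# Hypothetical data with one outlier, chosen only for illustration.
DATA = [-1.2, 0.3, 0.8, 1.1, 1.9, 2.4, 2.6, 3.0, 3.7, 9.5]

if __name__ == "__main__":
    mu_hat, sigma_hat = fit(DATA)
    print("location:", mu_hat, "scale:", sigma_hat)
    print("nll before:", nll(0.0, 1.0, DATA), "nll after:", nll(mu_hat, sigma_hat, DATA))
```

With this preconditioning the scale update is multiplicative, so for step sizes below 2 the scale stays positive without an explicit constraint. Other targets would need a different surrogate and parameter map.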

Code Implementations: 1 repo