LG · Mar 4, 2018

Accelerating Natural Gradient with Higher-Order Invariance

arXiv:1803.01273v2 · 27 citations
Originality: Incremental advance
AI Analysis

This work targets the slow convergence of natural gradient optimization in machine learning by improving the invariance properties of its updates. It is an incremental advance because it builds on existing natural gradient methods.

The paper tackles the loss of invariance in natural gradient optimization with finite step sizes by proposing higher-order integrators and geodesic corrections to achieve more invariant trajectories, resulting in faster optimization in deep neural network training and reinforcement learning.

An appealing property of the natural gradient is that it is invariant to arbitrary differentiable reparameterizations of the model. However, this invariance property requires infinitesimal steps and is lost in practical implementations with small but finite step sizes. In this paper, we study invariance properties from a combined perspective of Riemannian geometry and numerical differential equation solving. We define the order of invariance of a numerical method to be its convergence order to an invariant solution. We propose to use higher-order integrators and geodesic corrections to obtain more invariant optimization trajectories. We prove the numerical convergence properties of geodesic corrected updates and show that they can be as computationally efficient as plain natural gradient. Experimentally, we demonstrate that invariance leads to faster optimization and our techniques improve on traditional natural gradient in deep neural network training and natural policy gradient for reinforcement learning.
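The loss of invariance at finite step sizes can be seen in a toy problem. The sketch below is not taken from the paper: it assumes a zero-mean Gaussian model N(0, sigma^2), whose Fisher information is 2/sigma^2 in sigma and 2 in s = log(sigma), and an expected negative log-likelihood loss with an assumed data variance `V`. It takes one natural gradient step in each parameterization and measures how far the two results disagree once mapped back to sigma. The midpoint rule stands in here as one example of a higher-order integrator, and all function names are illustrative.

```python
import math

V = 1.0  # assumed data variance for this toy problem


def flow_sigma(sigma):
    # Natural gradient direction in the sigma parameterization:
    # -(1/G) * dL/dsigma, with G = 2/sigma^2 and L = log(sigma) + V/(2*sigma^2).
    return -(sigma - V / sigma) / 2.0


def flow_log_sigma(s):
    # The same flow written in s = log(sigma), where G = 2
    # and L = s + V*exp(-2s)/2.
    return -(1.0 - V * math.exp(-2.0 * s)) / 2.0


def euler_step(f, x, h):
    # Plain natural gradient update: a first-order (Euler) step.
    return x + h * f(x)


def midpoint_step(f, x, h):
    # Second-order midpoint rule: evaluate the flow at a half step.
    x_half = x + 0.5 * h * f(x)
    return x + h * f(x_half)


def invariance_gap(step, sigma0, h):
    # Take one step in each parameterization and compare the results in sigma.
    sigma_a = step(flow_sigma, sigma0, h)
    sigma_b = math.exp(step(flow_log_sigma, math.log(sigma0), h))
    return abs(sigma_a - sigma_b)


gap_euler = invariance_gap(euler_step, 2.0, 0.5)
gap_midpoint = invariance_gap(midpoint_step, 2.0, 0.5)
print(gap_euler, gap_midpoint)
```

In the continuous-time limit both parameterizations trace the same curve, so any gap comes purely from discretization. In this setup the midpoint step leaves a smaller gap than the Euler step, which is the sense in which a higher-order integrator is "more invariant".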

Code Implementations: 2 repos