OCLGNAFeb 13, 2022

Efficient Natural Gradient Descent Methods for Large-Scale PDE-Based Optimization Problems

arXiv:2202.06236v314 citations
AI Analysis

This work addresses the problem of scaling natural gradient methods for researchers and practitioners in optimization and PDE-based applications, though it is incremental as it builds on existing NGD concepts with computational improvements.

The authors tackled the computational challenge of implementing natural gradient descent (NGD) for large-scale PDE-based optimization by representing the natural gradient as a least-squares problem, enabling efficient computation without direct matrix inversion. They achieved the ability to compute Wasserstein NGD in thousands of dimensions, which was previously considered infeasible.

We propose efficient numerical schemes for implementing the natural gradient descent (NGD) for a broad range of metric spaces with applications to PDE-based optimization problems. Our technique represents the natural gradient direction as a solution to a standard least-squares problem. Hence, instead of calculating, storing, or inverting the information matrix directly, we apply efficient methods from numerical linear algebra. We treat both scenarios where the Jacobian, i.e., the derivative of the state variable with respect to the parameter, is either explicitly known or implicitly given through constraints. We can thus reliably compute several natural NGDs for a large-scale parameter space. In particular, we are able to compute Wasserstein NGD in thousands of dimensions, which was believed to be out of reach. Finally, our numerical results shed light on the qualitative differences between the standard gradient descent and various NGD methods based on different metric spaces in nonconvex optimization problems.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes