ML LGFeb 15, 2021

Tractable structured natural gradient descent using local parameterizations

Wu Lin, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

arXiv:2102.07405v1016.936 citations

Originality Incremental advance

AI Analysis

This work addresses a bottleneck in scalable optimization for machine learning practitioners, though it appears incremental as it builds on and extends existing geometric methods.

The paper tackles the computational challenge of natural-gradient descent on structured parameter spaces by introducing a method using local-parameter coordinates, resulting in a flexible and efficient approach that generalizes existing algorithms and yields new ones for applications in deep learning, variational inference, and evolution strategies.

Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations. We address this issue by using \emph{local-parameter coordinates} to obtain a flexible and efficient NGD method that works well for a wide-variety of structured parameterizations. We show four applications where our method (1) generalizes the exponential natural evolutionary strategy, (2) recovers existing Newton-like algorithms, (3) yields new structured second-order algorithms via matrix groups, and (4) gives new algorithms to learn covariances of Gaussian and Wishart-based distributions. We show results on a range of problems from deep learning, variational inference, and evolution strategies. Our work opens a new direction for scalable structured geometric methods.

View on arXiv PDF

Similar