LG ML · Feb 5, 2019

Modular Block-diagonal Curvature Approximations for Feedforward Architectures

Felix Dangel, Stefan Harmeling, Philipp Hennig

arXiv:1902.01813v3 · 11.116 citations · Has Code

Originality: Incremental advance

AI Analysis

This work provides a modular method for computing curvature approximations in machine learning. The advance is incremental: it builds on and generalizes existing block-diagonal approaches.

The authors tackled the problem of computing block-diagonal approximations to curvature matrices (e.g., Hessian) in feedforward architectures by proposing a modular extension of backpropagation, which simplifies manual derivations and integrates easily into existing libraries.

We propose a modular extension of backpropagation for the computation of block-diagonal approximations to various curvature matrices of the training objective (in particular, the Hessian, generalized Gauss-Newton, and positive-curvature Hessian). The approach reduces the otherwise tedious manual derivation of these matrices into local modules, and is easy to integrate into existing machine learning libraries. Moreover, we develop a compact notation derived from matrix differential calculus. We outline different strategies applicable to our method. They subsume recently-proposed block-diagonal approximations as special cases, and are extended to convolutional neural networks in this work.
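To make the idea of propagating curvature through local modules concrete, here is a minimal NumPy sketch. It is not the authors' code: the two-layer network (linear, sigmoid, linear), the squared-error loss, and the names `forward`, `hessian_blocks`, and `numerical_hessian` are all assumptions chosen for illustration. Each module receives the gradient and Hessian of the loss with respect to its output and passes back the same quantities with respect to its input; the diagonal Hessian block for each weight matrix is read off locally along the way.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(params, x):
    # Hypothetical network: linear -> sigmoid -> linear.
    W1, b1, W2, b2 = params
    a1 = W1 @ x + b1
    h1 = sigmoid(a1)
    z = W2 @ h1 + b2
    return a1, h1, z

def loss(params, x, y):
    # Squared-error loss for a single sample (an assumption of this sketch).
    return 0.5 * np.sum((forward(params, x)[2] - y) ** 2)

def hessian_blocks(params, x, y):
    """Diagonal Hessian blocks of the loss w.r.t. W1 and W2 (row-major flattening)."""
    W1, b1, W2, b2 = params
    a1, h1, z = forward(params, x)

    # Loss module: gradient and Hessian w.r.t. the network output.
    g_z = z - y
    H_z = np.eye(z.size)

    # Second linear module: parameter block, then pass curvature to its input.
    H_W2 = np.kron(H_z, np.outer(h1, h1))
    g_h1 = W2.T @ g_z
    H_h1 = W2.T @ H_z @ W2

    # Sigmoid module: Jacobian term plus a term from the second derivative.
    s1 = h1 * (1.0 - h1)            # sigma'(a1)
    s2 = s1 * (1.0 - 2.0 * h1)      # sigma''(a1)
    H_a1 = s1[:, None] * H_h1 * s1[None, :] + np.diag(s2 * g_h1)

    # First linear module: parameter block.
    H_W1 = np.kron(H_a1, np.outer(x, x))
    return H_W1, H_W2

def numerical_hessian(params, idx, x, y, eps=1e-4):
    """Finite-difference Hessian of the loss w.r.t. params[idx], for checking."""
    n = params[idx].size
    H = np.zeros((n, n))

    def f(delta):
        p = [q.copy() for q in params]
        p[idx] = p[idx] + delta.reshape(p[idx].shape)
        return loss(p, x, y)

    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = eps
            ej = np.zeros(n); ej[j] = eps
            H[i, j] = (f(ei + ej) - f(ei - ej) - f(-ei + ej) + f(-ei - ej)) / (4 * eps**2)
    return H

rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 2)), rng.normal(size=3),
          rng.normal(size=(2, 3)), rng.normal(size=2)]
x, y = rng.normal(size=2), rng.normal(size=2)
H_W1, H_W2 = hessian_blocks(params, x, y)
print(np.allclose(H_W1, numerical_hessian(params, 0, x, y), atol=1e-4))
```

In this sketch the blocks are exact Hessian blocks for a single sample. Dropping the `np.diag(s2 * g_h1)` term in the sigmoid module would give the corresponding generalized Gauss-Newton block instead, which is always positive semi-definite.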
