LG · ML · Feb 5, 2019

Modular Block-diagonal Curvature Approximations for Feedforward Architectures

arXiv:1902.01813v3 · 16 citations
Originality: Incremental advance
AI Analysis

This work presents a modular method for computing curvature approximations in machine learning. The advance is incremental: it builds on and generalizes existing block-diagonal approaches.

The authors tackle the problem of computing block-diagonal approximations to curvature matrices (e.g., the Hessian) in feedforward architectures. They propose a modular extension of backpropagation that breaks the otherwise manual derivations into local modules and integrates easily into existing libraries.

We propose a modular extension of backpropagation for the computation of block-diagonal approximations to various curvature matrices of the training objective (in particular, the Hessian, generalized Gauss-Newton, and positive-curvature Hessian). The approach reduces the otherwise tedious manual derivation of these matrices into local modules, and is easy to integrate into existing machine learning libraries. Moreover, we develop a compact notation derived from matrix differential calculus. We outline different strategies applicable to our method. They subsume recently-proposed block-diagonal approximations as special cases, and are extended to convolutional neural networks in this work.
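The modular idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the generalized Gauss-Newton (GGN) case, a single sample, a squared-error loss, and layers without biases, and every function name (`linear_backward`, `tanh_backward`, and the small matrix helpers) is hypothetical. Each module receives the curvature matrix with respect to its output and returns the curvature with respect to its input, plus the diagonal block for its own parameters.

```python
import math

# Small pure-Python matrix helpers (hypothetical names, lists of lists).
def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    # Kronecker product: entry [(i, j), (k, l)] equals A[i][k] * B[j][l].
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def linear_backward(W, x, G_out):
    """Module z = W x: map the curvature w.r.t. z to the curvature w.r.t. x
    and to the block for W (W flattened row by row)."""
    G_in = matmul(matmul(transpose(W), G_out), W)   # J^T G J with J = W
    xxT = [[xi * xj for xj in x] for xi in x]
    G_W = kron(G_out, xxT)                          # G_out (x) x x^T
    return G_in, G_W

def tanh_backward(z, G_out):
    """Module a = tanh(z): the GGN keeps only the Jacobian term D G D and
    drops the second-derivative term of the activation."""
    d = [1.0 - math.tanh(v) ** 2 for v in z]
    n = len(z)
    return [[d[i] * G_out[i][j] * d[j] for j in range(n)] for i in range(n)]

# Toy network: x -> W1 -> tanh -> W2 -> 0.5 * (output - y)^2
x = [0.5, -1.0]
W1 = [[0.1, 0.2], [-0.3, 0.4], [0.5, -0.6]]
W2 = [[0.7, -0.8, 0.9]]

# Forward pass, storing the inputs each module needs later.
z1 = matvec(W1, x)
a = [math.tanh(v) for v in z1]

# Backward pass for curvature, starting from the loss Hessian w.r.t. the output.
G = [[1.0]]                            # Hessian of the squared error is the identity
G, G_W2 = linear_backward(W2, a, G)    # 3x3 block for W2
G = tanh_backward(z1, G)
G, G_W1 = linear_backward(W1, x, G)    # 6x6 block for W1
```

Together, `G_W1` and `G_W2` form the block diagonal of the GGN for this toy network; the cross terms between the two layers are never computed. For the full Hessian, the activation module would also have to add a term involving the second derivative of `tanh`, which this sketch omits.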

Code Implementations: 1 repo
