LG · ML · Aug 19, 2024

Second-Order Forward-Mode Automatic Differentiation for Optimization

arXiv:2408.10419v1 · 5 citations · h-index: 8 · Has Code
Originality: Highly original
AI Analysis

This work addresses optimization challenges in machine learning by providing a novel method for second-order optimization without backpropagation, which could benefit researchers and practitioners dealing with memory constraints.

The paper tackles the optimization of machine learning models without backpropagation. It introduces a second-order hyperplane search, which generalizes a second-order line search to a $k$-dimensional hyperplane, and combines it with forward-mode stochastic gradient methods. The resulting algorithm uses forward passes only and so avoids backpropagation's storage overhead.
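As a rough illustration of what moving from a line to a $k$-dimensional hyperplane means, the step can be derived from a second-order Taylor expansion. The symbols below ($\theta$ for the parameters, $V = [v_1, \dots, v_k]$ for the search directions, $\kappa \in \mathbb{R}^k$ for the step coefficients, $g$ and $H$ for the gradient and Hessian of the loss $f$ at $\theta$) are notation assumed for this sketch and need not match the paper's.

$$
f(\theta + V\kappa) \approx f(\theta) + \kappa^\top V^\top g + \tfrac{1}{2}\,\kappa^\top V^\top H V \kappa
$$

Setting the derivative with respect to $\kappa$ to zero, and assuming the $k \times k$ matrix $V^\top H V$ is invertible, gives

$$
\kappa^\ast = -\left(V^\top H V\right)^{-1} V^\top g, \qquad \theta \leftarrow \theta + V\kappa^\ast .
$$

Only the directional quantities $v_i^\top g$ and $v_i^\top H v_j$ appear, so neither the full gradient nor the full Hessian has to be formed. For $k = 1$ this reduces to a second-order line search with step $\kappa^\ast = -\,(v^\top g)/(v^\top H v)$.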

This paper introduces a second-order hyperplane search, a novel optimization step that generalizes a second-order line search from a line to a $k$-dimensional hyperplane. This, combined with the forward-mode stochastic gradient method, yields a second-order optimization algorithm that consists of forward passes only, completely avoiding the storage overhead of backpropagation. Unlike recent work that relies on directional derivatives (or Jacobian-vector products, JVPs), we use hyper-dual numbers to jointly evaluate both directional derivatives and their second-order quadratic terms. As a result, we introduce forward-mode weight perturbation with Hessian information (FoMoH). We then use FoMoH to develop a novel generalization of line search by extending it to a hyperplane search. We illustrate the utility of this extension and how it might be used to overcome some of the recent challenges of optimizing machine learning models without backpropagation. Our code is open-sourced at https://github.com/SRI-CSL/fomoh.
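To make the role of hyper-dual numbers concrete, here is a minimal pure-Python sketch. It is not taken from the linked repository and does not use its API: the class `HyperDual`, the helpers `directional_terms` and `plane_search_step`, and the toy quadratic `loss` are all hypothetical names invented for illustration. The sketch supports only addition and multiplication and fixes $k = 2$ so that the small linear system can be solved in closed form.

```python
class HyperDual:
    """Hypothetical hyper-dual number a + b*e1 + c*e2 + d*e1*e2, with e1**2 == e2**2 == 0."""

    def __init__(self, real, e1=0.0, e2=0.0, e12=0.0):
        self.real, self.e1, self.e2, self.e12 = real, e1, e2, e12

    @staticmethod
    def lift(x):
        # Promote plain numbers to hyper-dual constants.
        return x if isinstance(x, HyperDual) else HyperDual(float(x))

    def __add__(self, other):
        o = HyperDual.lift(other)
        return HyperDual(self.real + o.real, self.e1 + o.e1,
                         self.e2 + o.e2, self.e12 + o.e12)

    __radd__ = __add__

    def __mul__(self, other):
        o = HyperDual.lift(other)
        return HyperDual(
            self.real * o.real,
            self.real * o.e1 + self.e1 * o.real,
            self.real * o.e2 + self.e2 * o.real,
            # Cross terms carry the second-order information.
            self.real * o.e12 + self.e1 * o.e2 + self.e2 * o.e1 + self.e12 * o.real,
        )

    __rmul__ = __mul__


def directional_terms(f, theta, u, v):
    """One forward pass: returns f(theta), grad.u, grad.v and u^T H v."""
    inputs = [HyperDual(t, ui, vi) for t, ui, vi in zip(theta, u, v)]
    out = f(inputs)
    return out.real, out.e1, out.e2, out.e12


def plane_search_step(f, theta, v1, v2):
    """Second-order step restricted to the plane spanned by v1 and v2 (k = 2)."""
    _, g1, _, h11 = directional_terms(f, theta, v1, v1)
    _, _, g2, h12 = directional_terms(f, theta, v1, v2)
    _, _, _, h22 = directional_terms(f, theta, v2, v2)
    det = h11 * h22 - h12 * h12
    if abs(det) < 1e-12:
        raise ValueError("projected Hessian is singular for these directions")
    # Solve [[h11, h12], [h12, h22]] @ kappa = -[g1, g2] by Cramer's rule.
    k1 = (-g1 * h22 + g2 * h12) / det
    k2 = (-h11 * g2 + h12 * g1) / det
    return [t + k1 * a + k2 * b for t, a, b in zip(theta, v1, v2)]


def loss(p):
    # Toy convex quadratic (an assumption for this demo); its minimizer is (2, 0).
    x, y = p
    return x * x + x * y + 3 * y * y + (-4) * x + (-2) * y


theta_new = plane_search_step(loss, [0.0, 0.0], [1.0, 0.0], [1.0, 1.0])
```

Because the toy loss is quadratic and the two directions span the whole parameter space, a single step lands on the minimizer (2, 0). On a real model the directions would typically be sampled at random and span only a small subspace, the step would be repeated, and the projected Hessian is not guaranteed to be positive definite, so a practical implementation would need safeguards this sketch omits.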

