LG · ML · Oct 18, 2018

First-order and second-order variants of the gradient descent in a unified framework

arXiv:1810.08102v4 · 7 citations
Originality: Synthesis-oriented
AI Analysis

This work gives machine learning practitioners a theoretical unification of existing methods. Its contribution is incremental: it synthesizes known methods without introducing new algorithms.

The paper proposes a general framework that interprets six gradient descent variants, including vanilla gradient descent and Newton's method, as instances of the same approach. It also explains what is specific to each method and the conditions under which some of them coincide.

In this paper, we provide an overview of first-order and second-order variants of the gradient descent method that are commonly used in machine learning. We propose a general framework in which 6 of these variants can be interpreted as different instances of the same approach. They are the vanilla gradient descent, the classical and generalized Gauss-Newton methods, the natural gradient descent method, the gradient covariance matrix approach, and Newton's method. Besides interpreting these methods within a single framework, we explain their specificities and show under which conditions some of them coincide.
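
One way to picture "different instances of the same approach" is a preconditioned gradient step, in which each method supplies its own matrix M in the update θ ← θ − α M⁻¹ ∇L. The sketch below rests on that assumption and is not taken from the paper. The function name `preconditioned_step`, the toy linear regression problem, and all variable names are hypothetical. It compares vanilla gradient descent (M = I), the classical Gauss-Newton method (M = mean of JᵀJ), and Newton's method (M = Hessian) on a noiseless least-squares problem, a setting where the last two happen to coincide.

```python
import numpy as np

def preconditioned_step(theta, grad, metric, lr=1.0):
    """Return theta - lr * M^{-1} grad for a symmetric positive-definite M.

    Assumed common form of the update; each method only changes `metric`.
    """
    return theta - lr * np.linalg.solve(metric, grad)

# Toy problem (hypothetical): noiseless linear regression, loss = 0.5 * mean((X @ theta - y)**2).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta
n = len(y)

def grad_fn(theta):
    """Gradient of the mean squared-error loss with respect to theta."""
    return X.T @ (X @ theta - y) / n

theta = np.zeros(3)
grad = grad_fn(theta)

# Vanilla gradient descent: M is the identity.
identity = np.eye(3)

# Classical Gauss-Newton: M is the mean of J^T J, where J = x_i^T for a linear model.
gauss_newton = X.T @ X / n

# Newton's method: M is the Hessian, estimated here by central differences of the gradient.
h = 1e-3
hessian = np.column_stack([
    (grad_fn(theta + h * e) - grad_fn(theta - h * e)) / (2 * h)
    for e in np.eye(3)
])

theta_gd = preconditioned_step(theta, grad, identity, lr=0.1)
theta_gn = preconditioned_step(theta, grad, gauss_newton)
theta_newton = preconditioned_step(theta, grad, hessian)

print("Gauss-Newton matrix equals Hessian:", np.allclose(gauss_newton, hessian, atol=1e-6))
print("vanilla GD step:   ", theta_gd)
print("Gauss-Newton step: ", theta_gn)
print("Newton step:       ", theta_newton)
```

For a linear model with squared loss the Hessian contains no second-derivative term from the model, so the Gauss-Newton matrix and the Hessian agree and both steps reach the least-squares solution in a single update. For nonlinear models the two matrices generally differ.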

