ML, LG, OC | Oct 29, 2018

Kalman Gradient Descent: Adaptive Variance Reduction in Stochastic Optimization

arXiv:1810.12273v1 | 18 citations
Originality: Incremental advance
AI Analysis

This paper addresses noisy gradient estimates in stochastic optimization, a practical concern for machine learning practitioners. It offers an incremental improvement that also extends to existing methods such as SGD with momentum and RMSProp.

The paper tackles gradient variance in stochastic optimization by introducing Kalman Gradient Descent, which applies Kalman filtering to the gradient estimates to reduce their variance adaptively. It supports the method with a convergence analysis in a non-convex setting and with experiments on neural networks and black box variational inference.

We introduce Kalman Gradient Descent, a stochastic optimization algorithm that uses Kalman filtering to adaptively reduce gradient variance in stochastic gradient descent by filtering the gradient estimates. We present both a theoretical analysis of convergence in a non-convex setting and experimental results which demonstrate improved performance on a variety of machine learning areas including neural networks and black box variational inference. We also present a distributed version of our algorithm that enables large-dimensional optimization, and we extend our algorithm to SGD with momentum and RMSProp.
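
The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a scalar random-walk model for each gradient coordinate, and the names `kalman_filter_step`, `kalman_gd`, `make_noisy_quadratic_grad`, the process noise `q`, and the observation noise `r` are all hypothetical choices made for illustration.

```python
import random


def kalman_filter_step(g_hat, p, g_obs, q, r):
    """One scalar Kalman update for a gradient coordinate.

    Assumption (not from the paper): the true gradient follows a random walk
    with process variance q and is observed with noise variance r.
    """
    p_pred = p + q                        # predict: uncertainty grows by q
    k = p_pred / (p_pred + r)             # Kalman gain in (0, 1)
    g_new = g_hat + k * (g_obs - g_hat)   # correct estimate toward observation
    p_new = (1.0 - k) * p_pred            # posterior variance of the estimate
    return g_new, p_new


def kalman_gd(noisy_grad, x0, lr=0.1, q=0.01, r=1.0, steps=500):
    """Gradient descent that steps along the filtered gradient estimate."""
    x = list(x0)
    g_hat = [0.0] * len(x)   # filtered gradient estimate per coordinate
    p = [1.0] * len(x)       # estimate variance per coordinate
    for _ in range(steps):
        g_obs = noisy_grad(x)
        for i in range(len(x)):
            g_hat[i], p[i] = kalman_filter_step(g_hat[i], p[i], g_obs[i], q, r)
            x[i] -= lr * g_hat[i]
    return x


def make_noisy_quadratic_grad(noise_std, seed=0):
    """Noisy gradient of f(x) = 0.5 * ||x||^2, a toy objective for this sketch."""
    rng = random.Random(seed)

    def grad(x):
        return [xi + rng.gauss(0.0, noise_std) for xi in x]

    return grad


if __name__ == "__main__":
    x_final = kalman_gd(make_noisy_quadratic_grad(noise_std=0.1), [5.0, -3.0])
    print(x_final)
```

In this sketch a larger `r` relative to `q` makes the filter trust new gradient samples less, which smooths the update direction at the cost of reacting more slowly to genuine changes in the gradient.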

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.