LG · NE · Apr 2, 2022

AdaSmooth: An Adaptive Learning Rate Method based on Effective Ratio

arXiv:2204.00825v1 · 5 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of hyperparameter optimization for machine learning practitioners, though it appears incremental because it builds on existing adaptive methods.

The paper tackles the problem of tedious hyperparameter tuning in stochastic optimizers like Momentum and AdaGrad by introducing AdaSmooth, an adaptive learning rate method that requires no manual tuning and shows promising results on various neural networks and machine learning tasks.

It is well known that hyper-parameters must be chosen for Momentum, AdaGrad, AdaDelta, and other stochastic optimizers. In many cases, these hyper-parameters are tuned tediously based on experience, which makes the process more of an art than a science. We present a novel per-dimension learning rate method for gradient descent called AdaSmooth. The method is insensitive to hyper-parameters, so it does not require the manual hyper-parameter tuning that Momentum, AdaGrad, and AdaDelta do. We show promising results compared to other methods on different convolutional neural networks, a multi-layer perceptron, and alternative machine learning tasks. Empirical results demonstrate that AdaSmooth works well in practice and compares favorably to other stochastic optimization methods in neural networks.
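
To make "per-dimension learning rate" and the hyper-parameters in question concrete, here is a minimal sketch of the standard AdaGrad update, one of the baselines named in the abstract. It is not AdaSmooth itself, whose update rule is not given in this summary. The function name `adagrad_step`, the toy quadratic objective, and the values `lr=0.1` and `eps=1e-8` are assumptions chosen for illustration only.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """Apply one AdaGrad update to plain Python lists.

    Each coordinate keeps its own running sum of squared gradients, so each
    coordinate gets its own effective step size lr / (sqrt(sum) + eps).
    Returns (new_params, new_accum) without mutating the inputs.
    """
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g                      # accumulate squared gradient
        step = lr / (math.sqrt(a) + eps)   # per-dimension learning rate
        new_params.append(p - step * g)
        new_accum.append(a)
    return new_params, new_accum


def grad(params):
    """Gradient of the toy objective f(x, y) = 0.5 * (100 * x**2 + y**2)."""
    x, y = params
    return [100.0 * x, y]


# Hypothetical starting point; the two coordinates have very different curvature.
params = [1.0, 1.0]
accum = [0.0, 0.0]
for _ in range(200):
    params, accum = adagrad_step(params, grad(params), accum)
```

On the first step the two raw gradients differ by a factor of 100, yet both coordinates move by about 0.1, because each gradient is divided by its own accumulated magnitude. The global rate `lr` and the constant `eps` still have to be picked by hand, which is the kind of tuning the abstract says AdaSmooth aims to avoid.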

