OCLGFeb 29, 2020

Conjugate-gradient-based Adam for stochastic optimization and its application to deep learning

arXiv:2003.00231v27 citations
AI Analysis

This work addresses training efficiency for deep learning practitioners, but it is incremental as it combines existing methods.

The paper tackled the problem of slow training in deep neural networks by proposing a conjugate-gradient-based Adam algorithm, which reduced the number of training epochs compared to existing adaptive stochastic optimization methods in text and image classification tasks.

This paper proposes a conjugate-gradient-based Adam algorithm blending Adam with nonlinear conjugate gradient methods and shows its convergence analysis. Numerical experiments on text classification and image classification show that the proposed algorithm can train deep neural network models in fewer epochs than the existing adaptive stochastic optimization algorithms can.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes