OCLGMar 13, 2025

Adaptive Moment Estimation Optimization Algorithm Using Projection Gradient for Deep Learning

arXiv:2503.10005v11 citationsh-index: 2International Conference on Applied Mathematics, Modelling and Intelligent Computing
Originality Incremental advance
AI Analysis

This is an incremental improvement for deep learning practitioners, offering a new optimization method to boost training efficiency and model performance.

The paper tackles the challenge of training deep neural networks by proposing PadamP, a novel optimization algorithm that accelerates training and enhances performance, with experiments showing it outperforms existing algorithms in convergence speed and generalization on datasets like CIFAR-10 and CIFAR-100.

Training deep neural networks is challenging. To accelerate training and enhance performance, we propose PadamP, a novel optimization algorithm. PadamP is derived by applying the adaptive estimation of the p-th power of the second-order moments under scale invariance, enhancing projection adaptability by modifying the projection discrimination condition. It is integrated into Adam-type algorithms, accelerating training, boosting performance, and improving generalization in deep learning. Combining projected gradient benefits with adaptive moment estimation, PadamP tackles unconstrained non-convex problems. Convergence for the non-convex case is analyzed, focusing on the decoupling of first-order moment estimation coefficients and second-order moment estimation coefficients. Unlike prior work relying on , our proof generalizes the convergence theorem, enhancing practicality. Experiments using VGG-16 and ResNet-18 on CIFAR-10 and CIFAR-100 show PadamP's effectiveness, with notable performance on CIFAR-10/100, especially for VGG-16. The results demonstrate that PadamP outperforms existing algorithms in terms of convergence speed and generalization ability, making it a valuable addition to the field of deep learning optimization.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes