LG OC ML Nov 10, 2024

An Energy-Based Self-Adaptive Learning Rate for Stochastic Gradient Descent: Enhancing Unconstrained Optimization with VAV method

arXiv:2411.06573v12.6h-index: 13

Originality: Incremental advance

AI Analysis

This work addresses the problem of efficient and stable optimization for machine learning practitioners, offering an incremental improvement over existing methods like SGD.

The paper tackles learning-rate selection in machine learning by proposing the Vector Auxiliary Variable (VAV) algorithm, an energy-based method with a self-adjusting learning rate for unconstrained optimization. Compared with Stochastic Gradient Descent, VAV is reported to remain stable at larger learning rates and to converge faster in the early stage of training across various tasks.

Optimizing the learning rate remains a critical challenge in machine learning, essential for achieving model stability and efficient convergence. The Vector Auxiliary Variable (VAV) algorithm introduces a novel energy-based, self-adjustable learning rate optimization method designed for unconstrained optimization problems. It incorporates an auxiliary variable $r$ to facilitate efficient energy approximation without backtracking while adhering to the unconditional energy dissipation law. Notably, VAV shows superior stability with larger learning rates and achieves faster convergence in the early stage of the training process. Comparative analyses demonstrate that VAV outperforms Stochastic Gradient Descent (SGD) across various tasks. This paper also provides a rigorous proof of the energy dissipation law and establishes the convergence of the algorithm under reasonable assumptions. Additionally, $r$ acts as an empirical lower bound on the training loss in practice, offering a novel scheduling approach that further enhances algorithm performance.
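
The abstract does not state the VAV update rule, so the following is not the paper's algorithm. It is a minimal sketch of a generic elementwise auxiliary-variable scheme, included only to illustrate how an auxiliary variable $r$ can give energy dissipation for any step size without backtracking. The toy quadratic loss, the shift `c`, the step size `eta`, and all function names are assumptions made for this illustration. Each component of `r` is divided by a factor of at least 1, so the energy `sum(r_i^2)` cannot increase, whatever `eta` is.

```python
import math

def loss(x):
    # Hypothetical toy objective: an ill-conditioned quadratic standing in for a training loss.
    return 0.5 * (x[0] ** 2 + 100.0 * x[1] ** 2)

def grad(x):
    # Exact gradient of the toy quadratic above.
    return [x[0], 100.0 * x[1]]

def energy_step(x, r, eta, c=1.0):
    """One generic auxiliary-variable step (an assumption, not the paper's VAV update)."""
    root = math.sqrt(loss(x) + c)
    # v is the gradient of sqrt(loss + c), the transformed objective.
    v = [g / (2.0 * root) for g in grad(x)]
    # Implicit update of r: the denominator is >= 1, so each r_i can only shrink.
    new_r = [ri / (1.0 + 2.0 * eta * vi * vi) for ri, vi in zip(r, v)]
    # The effective per-coordinate learning rate is scaled by the updated r.
    new_x = [xi - 2.0 * eta * ri * vi for xi, ri, vi in zip(x, new_r, v)]
    return new_x, new_r

def run(eta=0.5, steps=200, c=1.0):
    x = [1.0, 1.0]
    # Initialise every component of r to sqrt(loss(x0) + c).
    r = [math.sqrt(loss(x) + c)] * len(x)
    energies = [sum(ri * ri for ri in r)]
    for _ in range(steps):
        x, r = energy_step(x, r, eta, c)
        energies.append(sum(ri * ri for ri in r))
    return x, r, energies

if __name__ == "__main__":
    x, r, energies = run()
    print("final loss:", loss(x))
    print("energy non-increasing:", all(b <= a for a, b in zip(energies, energies[1:])))
```

For comparison, plain gradient descent with the same step size `eta = 0.5` multiplies the second coordinate by `1 - 0.5 * 100 = -49` at every step and therefore diverges on this quadratic, while the iterates of the sketch stay bounded because the shrinking `r` damps the step.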
