MLLGJun 3, 2025

Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression

arXiv:2506.02336v23 citationsh-index: 7
Originality Incremental advance
AI Analysis

This provides a faster optimization method for machine learning practitioners dealing with logistic regression, though it is incremental as it builds on prior work in convex settings.

The paper tackles the problem of accelerating gradient descent for ℓ₂-regularized logistic regression with linearly separable data by using large stepsizes, achieving a convergence rate of Õ(√κ) compared to the classical Õ(κ), and extends this to population risk minimization.

We study gradient descent (GD) with a constant stepsize for $\ell_2$-regularized logistic regression with linearly separable data. Classical theory suggests small stepsizes to ensure monotonic reduction of the optimization objective, achieving exponential convergence in $\widetilde{\mathcal{O}}(κ)$ steps with $κ$ being the condition number. Surprisingly, we show that this can be accelerated to $\widetilde{\mathcal{O}}(\sqrtκ)$ by simply using a large stepsize -- for which the objective evolves nonmonotonically. The acceleration brought by large stepsizes extends to minimizing the population risk for separable distributions, improving on the best-known upper bounds on the number of steps to reach a near-optimum. Finally, we characterize the largest stepsize for the local convergence of GD, which also determines the global convergence in special scenarios. Our results extend the analysis of Wu et al. (2024) from convex settings with minimizers at infinity to strongly convex cases with finite minimizers.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes