OCDSLGMLNov 4, 2021

Big-Step-Little-Step: Efficient Gradient Methods for Objectives with Multiple Scales

arXiv:2111.03137v18 citations
Originality Highly original
AI Analysis

This work addresses optimization efficiency for problems with multiple scales, offering significant speedups in computational complexity for convex optimization tasks.

The paper tackles the problem of efficiently solving a class of ill-conditioned optimization problems where the objective function decomposes into multiple non-interacting components, achieving a complexity bound that scales with the product of square roots of component condition numbers, which can improve almost exponentially over accelerated gradient methods.

We provide new gradient-based methods for efficiently solving a broad class of ill-conditioned optimization problems. We consider the problem of minimizing a function $f : \mathbb{R}^d \rightarrow \mathbb{R}$ which is implicitly decomposable as the sum of $m$ unknown non-interacting smooth, strongly convex functions and provide a method which solves this problem with a number of gradient evaluations that scales (up to logarithmic factors) as the product of the square-root of the condition numbers of the components. This complexity bound (which we prove is nearly optimal) can improve almost exponentially on that of accelerated gradient methods, which grow as the square root of the condition number of $f$. Additionally, we provide efficient methods for solving stochastic, quadratic variants of this multiscale optimization problem. Rather than learn the decomposition of $f$ (which would be prohibitively expensive), our methods apply a clean recursive "Big-Step-Little-Step" interleaving of standard methods. The resulting algorithms use $\tilde{\mathcal{O}}(d m)$ space, are numerically stable, and open the door to a more fine-grained understanding of the complexity of convex optimization beyond condition number.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes