OC IT LG MLDec 5, 2019

Lower Bounds for Non-Convex Stochastic Optimization

Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Nathan Srebro, Blake Woodworth

arXiv:1912.02365v244.9474 citations

Originality Highly original

AI Analysis

This work addresses the fundamental complexity of stochastic optimization for non-convex problems, providing theoretical limits that guide algorithm design and analysis in machine learning and AI.

The paper establishes worst-case lower bounds on the number of stochastic gradient queries needed to find ε-stationary points in non-convex optimization, proving a tight bound of ε^{-4} queries in a general model and ε^{-3} in a more restrictive model, confirming the optimality of stochastic gradient descent and variance reduction methods.

We lower bound the complexity of finding $ε$-stationary points (with gradient norm at most $ε$) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least $ε^{-4}$ queries to find an $ε$ stationary point. The lower bound is tight, and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of $ε^{-3}$ queries, establishing the optimality of recently proposed variance reduction techniques.

View on arXiv PDF

Similar