OC LG · Apr 20, 2018

Stochastic subgradient method converges on tame functions

Damek Davis, Dmitriy Drusvyatskiy, Sham Kakade, Jason D. Lee

arXiv:1804.07795v3 · 41.4317 citations

Originality: Highly original

AI Analysis

This paper provides foundational convergence guarantees for stochastic optimization in machine learning, addressing a key theoretical gap for non-smooth, non-convex problems such as deep learning.

The paper tackles the problem of establishing convergence guarantees for the stochastic subgradient method in non-smooth, non-convex settings, proving that it produces limit points that are first-order stationary for functions with Whitney stratifiable graphs, including all popular deep learning architectures.

This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity? We prove that the stochastic subgradient method, on any semialgebraic locally Lipschitz function, produces limit points that are all first-order stationary. More generally, our result applies to any function with a Whitney stratifiable graph. In particular, this work endows the stochastic subgradient method, and its proximal extension, with rigorous convergence guarantees for a wide class of problems arising in data science, including all popular deep learning architectures.
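To make the iteration concrete, here is a minimal sketch of a stochastic subgradient method on a one-dimensional test function. It is an illustration under stated assumptions, not the paper's code or experiments. The test function f(x) = ||x| - 1| is a hypothetical choice: it is semialgebraic, locally Lipschitz, non-smooth, and non-convex, with first-order stationary points at x = -1, 0, and 1. The helper names (`f`, `subgradient`, `stochastic_subgradient`), the Gaussian noise model, and the step size (k + 1)^(-0.75) are also assumptions. The step size is chosen to be square summable but not summable, the usual condition for this kind of analysis.

```python
import random


def f(x):
    """Hypothetical test function f(x) = | |x| - 1 |: non-smooth and non-convex."""
    return abs(abs(x) - 1.0)


def subgradient(x):
    """Return one element of the Clarke subdifferential of f at x.

    Away from the kinks this is the ordinary derivative. At the kinks
    (x = -1, 0, 1) the subdifferential is the interval [-1, 1], and we
    pick the element 0.
    """
    outer = 1.0 if abs(x) > 1.0 else (-1.0 if abs(x) < 1.0 else 0.0)
    inner = 1.0 if x > 0.0 else (-1.0 if x < 0.0 else 0.0)
    return outer * inner


def stochastic_subgradient(x0, steps=20000, noise_std=0.5, seed=0):
    """Run x_{k+1} = x_k - alpha_k * (g_k + xi_k) and return the last iterate.

    g_k is a subgradient of f at x_k and xi_k is zero-mean Gaussian noise
    (an assumed noise model standing in for minibatch sampling).
    """
    rng = random.Random(seed)
    x = x0
    for k in range(steps):
        # Assumed schedule: sum of alpha_k diverges, sum of alpha_k^2 converges.
        alpha = 1.0 / (k + 1) ** 0.75
        noisy_g = subgradient(x) + rng.gauss(0.0, noise_std)
        x = x - alpha * noisy_g
    return x


if __name__ == "__main__":
    x_final = stochastic_subgradient(3.0)
    print("final iterate:", x_final, "f value:", f(x_final))
```

In this sketch the iterates drift toward one of the minimizers x = 1 or x = -1, where 0 belongs to the subdifferential. That is the kind of first-order stationarity the abstract refers to. The paper's guarantee concerns limit points of the whole sequence, so a single finite run is only suggestive.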
