OC LG ML Oct 3, 2019

Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent

Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal

arXiv:1910.01277v12.0

Originality Highly original

AI Analysis

This paper addresses the challenge of optimizing nonconvex functions in machine learning applications where gradient information is unavailable, offering a model-free solution with theoretical guarantees.

The paper tackles the problem of nonconvex optimization without gradient access by proposing a method that estimates gradients to perform gradient descent, converging to a stationary point. It shows the algorithm returns an ε-second-order stationary point with Õ(d^(2+θ/2)/ε^(8+θ)) function queries for any θ>0.

Gradient descent and its variants are widely used in machine learning. However, oracle access to the gradient may not be available in many applications, limiting the direct use of gradient descent. This paper proposes a method of estimating the gradient to perform gradient descent that converges to a stationary point for general non-convex optimization problems. Beyond first-order stationarity, second-order stationarity properties are important in machine learning applications to achieve better performance. We show that the proposed model-free non-convex optimization algorithm returns an $\epsilon$-second-order stationary point with $\widetilde{O}\left(\frac{d^{2+\frac{\theta}{2}}}{\epsilon^{8+\theta}}\right)$ queries of the function for any arbitrary $\theta>0$.
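The idea of running gradient descent on an estimated gradient can be illustrated with a minimal sketch. It assumes a coordinate-wise central-difference estimator and a simple random perturbation whenever the estimated gradient is small. Both are generic stand-ins and not necessarily the estimator or perturbation schedule analyzed in the paper. The names `estimate_gradient`, `zeroth_order_descent`, and `saddle_example`, along with all parameter values, are hypothetical.

```python
import math
import random


def estimate_gradient(f, x, h=1e-5):
    """Central-difference gradient estimate; uses 2*d queries of f."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2.0 * h))
    return grad


def zeroth_order_descent(f, x0, step=0.1, iters=500,
                         grad_tol=1e-3, radius=1e-2, seed=0):
    """Gradient descent driven only by function queries.

    When the estimated gradient is small (a possible saddle point),
    the iterate is nudged by a small random perturbation (assumed rule).
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        g = estimate_gradient(f, x)
        if math.sqrt(sum(gi * gi for gi in g)) < grad_tol:
            x = [xi + rng.uniform(-radius, radius) for xi in x]
            g = estimate_gradient(f, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def saddle_example(p):
    """Toy function with a saddle at (0, 0) and minima at (0, +1) and (0, -1)."""
    x, y = p
    return x * x + y ** 4 / 4.0 - y * y / 2.0


if __name__ == "__main__":
    # With perturbation the iterate leaves the saddle and nears a minimum.
    print(zeroth_order_descent(saddle_example, [1.0, 0.0]))
    # With radius=0 the iterate started on the x-axis stalls at the saddle.
    print(zeroth_order_descent(saddle_example, [1.0, 0.0], radius=0.0))
```

Starting on the x-axis, the unperturbed run converges to the saddle at the origin because the estimated gradient in `y` is exactly zero there, while the perturbed run moves toward one of the two minima. This toy case only illustrates why second-order stationarity matters; it says nothing about the query complexity stated above.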
