OC IT ML

Dec 7, 2013

Optimal rates for zero-order convex optimization: the power of two function evaluations

John C. Duchi, Michael I. Jordan, Martin J. Wainwright, Andre Wibisono

arXiv:1312.2139v2 0.00583 citations

AI Analysis 70

The paper provides improved theoretical guarantees for derivative-free optimization, an incremental but important advance for applications where gradients are unavailable.

The paper tackles derivative-free convex optimization by showing that using pairs of function evaluations can achieve convergence rates with at most a √d factor penalty compared to gradient methods, establishing both upper bounds and matching lower bounds for smooth and non-smooth cases.

We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for $d$-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic gradient methods. We establish such results for both smooth and non-smooth cases, sharpening previous analyses that suggested a worse dimension dependence, and extend our results to the case of multiple ($m \ge 2$) evaluations. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, establishing the sharpness of our achievable results up to constant (sometimes logarithmic) factors.
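The two-evaluation idea in the abstract can be illustrated with a minimal sketch. It assumes a forward-difference estimator with a Gaussian perturbation direction and a $1/\sqrt{t}$ step size; the function names (`two_point_gradient`, `zero_order_descent`), the test objective, and all parameter values are hypothetical choices for illustration and are not taken from the paper, whose exact estimator and step-size schedule may differ.

```python
import random


def two_point_gradient(f, x, u, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Assumption: perturbation direction z ~ N(0, I_d) and a forward
    difference with smoothing parameter u (illustrative choices).
    """
    z = [rng.gauss(0.0, 1.0) for _ in x]
    x_perturbed = [xi + u * zi for xi, zi in zip(x, z)]
    # Finite-difference slope along z, then scaled back onto z.
    slope = (f(x_perturbed) - f(x)) / u
    return [slope * zi for zi in z]


def zero_order_descent(f, x0, steps, step_size, u, seed=0):
    """Gradient descent that only ever queries function values."""
    rng = random.Random(seed)
    x = list(x0)
    for t in range(1, steps + 1):
        g = two_point_gradient(f, x, u, rng)
        eta = step_size / (t ** 0.5)  # decaying step size (assumed schedule)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


def quadratic(x):
    """Hypothetical smooth convex objective with minimizer at (1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


x_final = zero_order_descent(quadratic, [0.0] * 5, steps=5000, step_size=0.1, u=1e-4)
print(quadratic(x_final))
```

Each iteration queries `f` exactly twice, at `x` and at `x + u*z`, and never touches a gradient. For a smooth objective the estimate is approximately unbiased, but its second moment grows with the dimension $d$, which is the source of the dimension-dependent penalty the abstract quantifies.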
