OC · IT · ML · Dec 7, 2013

Optimal rates for zero-order convex optimization: the power of two function evaluations

arXiv:1312.2139v2 · 0.00 · 583 citations
AI Analysis: 70

This paper provides improved theoretical guarantees for derivative-free optimization, an incremental but important advance for applications where gradients are unavailable.

The paper tackles derivative-free convex optimization by showing that using pairs of function evaluations can achieve convergence rates with at most a √d factor penalty compared to gradient methods, establishing both upper bounds and matching lower bounds for smooth and non-smooth cases.

We consider derivative-free algorithms for stochastic and non-stochastic convex optimization problems that use only function values rather than gradients. Focusing on non-asymptotic bounds on convergence rates, we show that if pairs of function values are available, algorithms for $d$-dimensional optimization that use gradient estimates based on random perturbations suffer a factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic gradient methods. We establish such results for both smooth and non-smooth cases, sharpening previous analyses that suggested a worse dimension dependence, and extend our results to the case of multiple ($m \ge 2$) evaluations. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, establishing the sharpness of our achievable results up to constant (sometimes logarithmic) factors.
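To make the idea of "gradient estimates based on random perturbations" concrete, here is a minimal sketch of a two-point estimator plugged into a plain gradient-descent loop. It is an illustration under stated assumptions, not the paper's exact procedure: the function names (`two_point_gradient`, `zero_order_sgd`), the Gaussian perturbation direction, the $1/\sqrt{t}$ step-size schedule, the iterate averaging, and the noiseless quadratic objective in the usage note are all choices made here for illustration.

```python
import numpy as np


def two_point_gradient(f, x, u, rng):
    """Estimate the gradient of f at x from two function values.

    Assumption: the direction z is standard Gaussian, so E[z z^T] = I and
    the estimate is close to unbiased for a smooth f when u is small.
    """
    z = rng.standard_normal(x.shape)
    # Finite difference along z, scaled back onto the direction z.
    return (f(x + u * z) - f(x)) / u * z


def zero_order_sgd(f, x0, steps, step_size, u, seed=0):
    """Gradient descent driven only by pairs of function evaluations.

    Hypothetical helper: uses step_size / sqrt(t) steps and returns the
    running average of the iterates.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    avg = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = two_point_gradient(f, x, u, rng)
        x = x - (step_size / np.sqrt(t)) * g
        avg += (x - avg) / t  # incremental mean of x_1, ..., x_t
    return avg
```

As a usage example, one could minimize `f = lambda x: 0.5 * np.sum((x - 1.0) ** 2)` from `x0 = np.zeros(5)` with `zero_order_sgd(f, x0, steps=2000, step_size=0.1, u=1e-3)`. Each iteration queries `f` exactly twice and never touches a gradient. The squared norm of the estimate grows roughly linearly with the dimension $d$, which gives some intuition for the $\sqrt{d}$ factor in the convergence rate.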

