Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization

arXiv:2409.1956734.15 citationsh-index: 4
AI Analysis

This addresses efficiency challenges in distributed optimization for nonconvex problems, offering incremental improvements in sampling cost and convergence rates.

The paper tackles the trade-off between convergence rate and sampling cost in distributed zeroth-order optimization for smooth nonconvex problems by proposing a novel variance-reduced gradient estimator, achieving oracle complexities of O(d/ε) for smooth nonconvex functions and O(dκ ln(1/ε)) for gradient-dominated ones.

This paper investigates distributed zeroth-order optimization for smooth nonconvex problems, targeting the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation in current algorithms that use either the $2$-point or $2d$-point gradient estimators. We propose a novel variance-reduced gradient estimator that either randomly renovates a single orthogonal direction of the true gradient or calculates the gradient estimation across all dimensions for variance correction, based on a Bernoulli distribution. Integrating this estimator with gradient tracking mechanism allows us to address the trade-off. We show that the oracle complexity of our proposed algorithm is upper bounded by $O(d/ε)$ for smooth nonconvex functions and by $O(dκ\ln (1/ε))$ for smooth and gradient dominated nonconvex functions, where $d$ denotes the problem dimension and $κ$ is the condition number. Numerical simulations comparing our algorithm with existing methods confirm the effectiveness and efficiency of the proposed gradient estimator.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes