Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization

Huaiyi Mu, Yujie Tang, Jie Song, Zhongkui Li

arXiv:2409.1956734.15 citationsh-index: 4

AI Analysis

This addresses efficiency challenges in distributed optimization for nonconvex problems, offering incremental improvements in sampling cost and convergence rates.

The paper tackles the trade-off between convergence rate and sampling cost in distributed zeroth-order optimization for smooth nonconvex problems by proposing a novel variance-reduced gradient estimator, achieving oracle complexities of O(d/ε) for smooth nonconvex functions and O(dκ ln(1/ε)) for gradient-dominated ones.

This paper investigates distributed zeroth-order optimization for smooth nonconvex problems, targeting the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation in current algorithms that use either the $2$-point or $2d$-point gradient estimators. We propose a novel variance-reduced gradient estimator that either randomly renovates a single orthogonal direction of the true gradient or calculates the gradient estimation across all dimensions for variance correction, based on a Bernoulli distribution. Integrating this estimator with gradient tracking mechanism allows us to address the trade-off. We show that the oracle complexity of our proposed algorithm is upper bounded by $O(d/Îµ)$ for smooth nonconvex functions and by $O(dÎº\ln (1/Îµ))$ for smooth and gradient dominated nonconvex functions, where $d$ denotes the problem dimension and $Îº$ is the condition number. Numerical simulations comparing our algorithm with existing methods confirm the effectiveness and efficiency of the proposed gradient estimator.

View on arXiv PDF

Similar