Huaiyi Mu, Yujie Tang, Jie Song et al.
This paper investigates distributed zeroth-order optimization for smooth nonconvex problems, targeting the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation in current algorithms that use either the $2$-point or $2d$-point gradient estimators. We propose a novel variance-reduced gradient estimator that either randomly renovates a single orthogonal direction of the true gradient or calculates the gradient estimation across all dimensions for variance correction, based on a Bernoulli distribution. Integrating this estimator with gradient tracking mechanism allows us to address the trade-off. We show that the oracle complexity of our proposed algorithm is upper bounded by $O(d/ε)$ for smooth nonconvex functions and by $O(dκ\ln (1/ε))$ for smooth and gradient dominated nonconvex functions, where $d$ denotes the problem dimension and $κ$ is the condition number. Numerical simulations comparing our algorithm with existing methods confirm the effectiveness and efficiency of the proposed gradient estimator.