LG NA · Jan 13, 2025

An Enhanced Zeroth-Order Stochastic Frank-Wolfe Framework for Constrained Finite-Sum Optimization

Haishan Ye, Yinghui Huang, Hao Di, Xiangyu Chang

arXiv:2501.07201v2 · 1 citation · h-index: 3

AI Analysis

This work addresses optimization challenges in high-dimensional ML applications where gradient computations are expensive or unavailable. It represents a strong, specific gain rather than a broad paradigm shift.

The paper tackles constrained finite-sum optimization problems common in large-scale ML by proposing an enhanced zeroth-order stochastic Frank-Wolfe framework with double variance reduction, achieving query complexities of O(d√n/ε) for convex objectives and O(d^{3/2}√n/ε²) for non-convex objectives.

We propose an enhanced zeroth-order stochastic Frank-Wolfe framework to address constrained finite-sum optimization problems, a structure prevalent in large-scale machine-learning applications. Our method introduces a novel double variance reduction framework that reduces both the gradient approximation variance induced by zeroth-order oracles and the stochastic sampling variance from finite-sum objectives. By leveraging this framework, our algorithm achieves significant improvements in query efficiency, making it particularly well-suited for high-dimensional optimization tasks. Specifically, for convex objectives, the algorithm achieves a query complexity of O(d√n/ε) to find an ε-suboptimal solution, where d is the dimensionality and n is the number of functions in the finite-sum objective. For non-convex objectives, it achieves a query complexity of O(d^{3/2}√n/ε²) without requiring the computation of d partial derivatives at each iteration. These complexities are the best known among zeroth-order stochastic Frank-Wolfe algorithms that avoid explicit gradient calculations. Empirical experiments on convex and non-convex machine learning tasks, including sparse logistic regression, robust classification, and adversarial attacks on deep networks, validate the computational efficiency and scalability of our approach. Our algorithm demonstrates superior performance in both convergence rate and query complexity compared to existing methods.
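To make the setting concrete, here is a minimal sketch of a plain zeroth-order stochastic Frank-Wolfe loop. It is not the paper's double variance reduction method, which the abstract does not spell out. It only illustrates the two noise sources the abstract names: finite-difference gradient estimates built from function values, and mini-batch sampling of the finite sum. The least-squares objective, the ℓ1-ball constraint, the step size 2/(t+2), and every function name below are assumptions chosen for illustration. The estimator is coordinate-wise, so it spends 2d queries per sampled component, which is the per-iteration cost of d partial derivatives that the abstract says the non-convex variant avoids.

```python
import random


def make_problem(n=20, d=5, seed=0):
    """Hypothetical finite-sum least-squares data with a feasible minimizer."""
    rng = random.Random(seed)
    x_true = [0.5, -0.3] + [0.0] * (d - 2)  # l1 norm 0.8, inside the unit l1 ball
    rows = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    targets = [sum(a * x for a, x in zip(row, x_true)) for row in rows]
    return rows, targets


def component_loss(row, target, x):
    """One term f_i(x) of the finite sum; the only oracle the solver may call."""
    residual = sum(a * xi for a, xi in zip(row, x)) - target
    return 0.5 * residual * residual


def full_loss(rows, targets, x):
    """Average of all components, used here only for reporting."""
    total = sum(component_loss(r, t, x) for r, t in zip(rows, targets))
    return total / len(rows)


def zo_minibatch_gradient(rows, targets, x, batch, h=1e-4):
    """Central-difference gradient estimate averaged over a mini-batch."""
    d = len(x)
    grad = [0.0] * d
    queries = 0
    for j in range(d):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        for i in batch:
            diff = (component_loss(rows[i], targets[i], x_plus)
                    - component_loss(rows[i], targets[i], x_minus))
            grad[j] += diff / (2.0 * h * len(batch))
            queries += 2  # two function evaluations per coordinate per sample
    return grad, queries


def l1_ball_lmo(grad, radius):
    """Linear minimization oracle: the l1-ball vertex minimizing <grad, s>."""
    j = max(range(len(grad)), key=lambda k: abs(grad[k]))
    vertex = [0.0] * len(grad)
    vertex[j] = -radius if grad[j] > 0 else radius
    return vertex


def zo_frank_wolfe(rows, targets, radius=1.0, iters=300, batch_size=10, seed=1):
    """Projection-free loop: every iterate is a convex combination of vertices."""
    rng = random.Random(seed)
    x = [0.0] * len(rows[0])
    total_queries = 0
    for t in range(iters):
        batch = rng.sample(range(len(rows)), batch_size)
        grad, used = zo_minibatch_gradient(rows, targets, x, batch)
        total_queries += used
        s = l1_ball_lmo(grad, radius)
        gamma = 2.0 / (t + 2)  # classical Frank-Wolfe step size
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x, total_queries


rows, targets = make_problem()
x_final, total_queries = zo_frank_wolfe(rows, targets)
f_start = full_loss(rows, targets, [0.0] * len(rows[0]))
f_final = full_loss(rows, targets, x_final)
print("loss at start:", f_start)
print("loss at end:  ", f_final)
print("function queries used:", total_queries)
```

In this sketch the query count grows as 2 · d · batch_size per iteration, which is why the dependence on d and n in the stated complexities matters in high dimensions. A variance-reduced method would aim to reach the same accuracy with fewer such queries.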

