Stochastic Compositional Optimization via Hybrid Momentum Frank--Wolfe
It addresses a gap in stochastic compositional optimization by removing the smoothness assumption on the outer function, enabling applications like robust max-of-losses and CVaR.
The paper tackles stochastic compositional optimization where the outer function may be non-smooth, proposing the Hybrid Momentum Stochastic Frank-Wolfe algorithm. It achieves an O(K^{-1/4}) convergence rate in the generalized Frank-Wolfe gap for non-convex objectives, matching optimal complexity for projection-free methods.
Stochastic compositional optimization minimizes objectives of the form $\min_{\bm{x} \in \mathcal{X}} F(\bm{f}(\bm{x}), \bm{x})$, where $\bm{f}$ is accessible only through noisy stochastic queries. Existing methods for this problem assume that the outer function $F$ is continuously differentiable, which excludes many practically important applications such as robust max-of-losses, Conditional Value-at-Risk, and norm regularizers. We propose the Hybrid Momentum Stochastic Frank--Wolfe algorithm, which drops the smoothness assumption on $F$. By combining a momentum-based Jacobian tracker with a Taylor-corrected function tracker, the algorithm feeds an entire stochastic linearization -- rather than a single gradient -- into a generalized linear minimization oracle. We establish an $\mathcal{O}(K^{-1/4})$ convergence rate in the generalized Frank--Wolfe gap for non-convex objectives with $L_F$-Lipschitz outer functions, matching the optimal complexity for projection-free single-sample stochastic methods under expected smoothness. The analysis extends to heavy-tailed noise oracles with bounded $r$-th moments for $r \in (1, 2]$ and recovers the deterministic rates of Vladarean et al (2023) as the noise vanishes.