OCLGOct 8, 2021

Heavy Ball Momentum for Conditional Gradient

arXiv:2110.04243v19 citations
Originality Incremental advance
AI Analysis

This work addresses an incremental improvement for machine learning and signal processing applications using conditional gradient methods, focusing on enhancing convergence properties.

The paper tackled the limitation that momentum cannot generally improve the convergence rate of Frank-Wolfe (FW) algorithms by introducing heavy ball momentum, establishing that it offers a unifying perspective on primal-dual convergence and enjoys a tighter per iteration primal-dual error rate for multiple step sizes, with numerical results demonstrating its usefulness.

Conditional gradient, aka Frank Wolfe (FW) algorithms, have well-documented merits in machine learning and signal processing applications. Unlike projection-based methods, momentum cannot improve the convergence rate of FW, in general. This limitation motivates the present work, which deals with heavy ball momentum, and its impact to FW. Specifically, it is established that heavy ball offers a unifying perspective on the primal-dual (PD) convergence, and enjoys a tighter per iteration PD error rate, for multiple choices of step sizes, where PD error can serve as the stopping criterion in practice. In addition, it is asserted that restart, a scheme typically employed jointly with Nesterov's momentum, can further tighten this PD error bound. Numerical results demonstrate the usefulness of heavy ball momentum in FW iterations.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes