Boosting Frank-Wolfe by Chasing Gradients
This work addresses a bottleneck in optimization algorithms for machine learning and related fields, offering incremental improvements to a widely used method.
The authors tackle the slow convergence of the Frank-Wolfe algorithm by proposing a method that aligns the descent direction with the negative gradient using a matching pursuit-style subroutine. The method achieves convergence rates ranging from $\mathcal{O}(1/t)$ to $\mathcal{O}(e^{-\omega t})$ and shows competitive advantages in computational experiments.
The Frank-Wolfe algorithm has become a popular first-order optimization algorithm because it is simple and projection-free, and it has been successfully applied to a variety of real-world problems. Its main drawback, however, lies in its convergence rate, which can be excessively slow due to naive descent directions. We propose to speed up the Frank-Wolfe algorithm by better aligning the descent direction with that of the negative gradient via a subroutine. This subroutine chases the negative gradient direction in a matching pursuit style while still preserving the projection-free property. Although the approach is reasonably natural, it produces very significant results. We derive convergence rates ranging from $\mathcal{O}(1/t)$ to $\mathcal{O}(e^{-\omega t})$ for our method, and we demonstrate its competitive advantage both per iteration and in CPU time over the state-of-the-art in a series of computational experiments.
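To make the idea of chasing the negative gradient concrete, the following is a minimal sketch and not the authors' exact procedure. It assumes a toy problem chosen purely for illustration: minimizing $f(x) = \tfrac{1}{2}\|x - b\|^2$ over the probability simplex, where the linear minimization oracle simply picks a coordinate. The function names (`chase_gradient`, `boosted_fw`), the parameters `max_rounds` and `delta`, and the exact line search are all assumptions of this sketch. The subroutine greedily sums Frank-Wolfe directions (vertex minus current point) while the sum keeps improving its alignment with the negative gradient, then rescales the result so the next iterate remains a convex combination of vertices and no projection is needed.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def objective(x, b):
    # Illustrative objective (an assumption of this sketch): 0.5 * ||x - b||^2
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))

def align(d, d_hat):
    # Cosine of the angle between d and d_hat; -1 if either vector is zero.
    nd, nh = math.sqrt(dot(d, d)), math.sqrt(dot(d_hat, d_hat))
    return dot(d, d_hat) / (nd * nh) if nd > 0 and nh > 0 else -1.0

def chase_gradient(x, neg_grad, max_rounds=5, delta=1e-3):
    # Matching-pursuit-style loop: repeatedly fit the residual of the
    # negative gradient with a Frank-Wolfe direction (vertex - x).
    n = len(x)
    d, total = [0.0] * n, 0.0
    for _ in range(max_rounds):
        r = [g - di for g, di in zip(neg_grad, d)]   # residual still to be matched
        i = max(range(n), key=lambda j: r[j])        # linear oracle over simplex vertices
        u = [(1.0 if j == i else 0.0) - x[j] for j in range(n)]
        uu = dot(u, u)
        if uu == 0.0:
            break
        lam = dot(r, u) / uu                         # least-squares weight for u
        d_new = [di + lam * ui for di, ui in zip(d, u)]
        # Stop once the alignment with the negative gradient no longer improves.
        if lam <= 0 or align(neg_grad, d_new) - align(neg_grad, d) < delta:
            break
        d, total = d_new, total + lam
    # Dividing by the summed weights makes x + g a convex combination of vertices.
    return [di / total for di in d] if total > 0 else d

def boosted_fw(b, x0, iters, max_rounds=5):
    x = list(x0)
    for _ in range(iters):
        neg_grad = [bi - xi for xi, bi in zip(x, b)]
        g = chase_gradient(x, neg_grad, max_rounds)
        gg = dot(g, g)
        if gg == 0.0:
            break
        # Exact line search for the quadratic, clipped to [0, 1] to stay feasible.
        gamma = min(1.0, max(0.0, dot(neg_grad, g) / gg))
        x = [xi + gamma * gi for xi, gi in zip(x, g)]
    return x

b, x0 = [0.2, 0.5, 0.3], [1.0, 0.0, 0.0]
f_plain = objective(boosted_fw(b, x0, iters=1, max_rounds=1), b)
f_boost = objective(boosted_fw(b, x0, iters=1, max_rounds=5), b)
print(f_plain, f_boost)
```

In this sketch, `max_rounds=1` reduces the direction to the plain Frank-Wolfe direction, so the two calls at the end compare a single plain step with a single boosted step from the same starting vertex. The alignment test plays the role of the stopping rule: extra oracle calls are spent only while they make the direction measurably closer to the negative gradient.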