MLOCDec 15, 2017

Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice

arXiv:1712.05654v2153 citations
Originality Highly original
AI Analysis

This work addresses the need for efficient optimization in machine learning and related fields by providing a practical acceleration method that applies broadly, though it is incremental as it builds on existing Nesterov acceleration theory.

The paper tackles the problem of accelerating gradient-based optimization methods by introducing Catalyst, a generic scheme that builds upon the inexact accelerated proximal point algorithm to achieve faster convergence for convex objectives. The result includes establishing faster rates for various algorithms like gradient descent and SVRG, with experiments showing practical acceleration, especially for ill-conditioned problems.

We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective function, and consists of approximately solving a sequence of well-chosen auxiliary problems, leading to faster convergence. One of the keys to achieve acceleration in theory and in practice is to solve these sub-problems with appropriate accuracy by using the right stopping criterion and the right warm-start strategy. We give practical guidelines to use Catalyst and present a comprehensive analysis of its global complexity. We show that Catalyst applies to a large class of algorithms, including gradient descent, block coordinate descent, incremental algorithms such as SAG, SAGA, SDCA, SVRG, MISO/Finito, and their proximal variants. For all of these methods, we establish faster rates using the Catalyst acceleration, for strongly convex and non-strongly convex objectives. We conclude with extensive experiments showing that acceleration is useful in practice, especially for ill-conditioned problems.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes