OC LG MLApr 3, 2020

Dualize, Split, Randomize: Toward Fast Nonsmooth Optimization Algorithms

Adil Salim, Laurent Condat, Konstantin Mishchenko, Peter Richtárik

arXiv:2004.02635v414.334 citations

Originality Incremental advance

AI Analysis

This work addresses optimization challenges in fields like image processing and machine learning by providing a unified algorithmic framework, though it is incremental as it builds on and extends existing primal-dual methods.

The authors tackled the problem of minimizing a sum of three convex functions, including smooth and nonsmooth components, by proposing a new primal-dual algorithm called PDDY, which unifies existing methods and achieves sublinear convergence rates in general and linear convergence under strong convexity, with applications in image processing and machine learning.

We consider minimizing the sum of three convex functions, where the first one F is smooth, the second one is nonsmooth and proximable and the third one is the composition of a nonsmooth proximable function with a linear operator L. This template problem has many applications, for instance, in image processing and machine learning. First, we propose a new primal-dual algorithm, which we call PDDY, for this problem. It is constructed by applying Davis-Yin splitting to a monotone inclusion in a primal-dual product space, where the operators are monotone under a specific metric depending on L. We show that three existing algorithms (the two forms of the Condat-Vu algorithm and the PD3O algorithm) have the same structure, so that PDDY is the fourth missing link in this self-consistent class of primal-dual algorithms. This representation eases the convergence analysis: it allows us to derive sublinear convergence rates in general, and linear convergence results in presence of strong convexity. Moreover, within our broad and flexible analysis framework, we propose new stochastic generalizations of the algorithms, in which a variance-reduced random estimate of the gradient of F is used, instead of the true gradient. Furthermore, we obtain, as a special case of PDDY, a linearly converging algorithm for the minimization of a strongly convex function F under a linear constraint; we discuss its important application to decentralized optimization.

View on arXiv PDF

Similar