LGOCMLApr 23, 2019

Stochastic Primal-Dual Algorithms with Faster Convergence than $O(1/\sqrt{T})$ for Problems without Bilinear Structure

arXiv:1904.10112v238 citations
Originality Incremental advance
AI Analysis

This work addresses a bottleneck in optimization for machine learning by enabling faster convergence in a broader class of min-max problems, though it is incremental as it builds on existing stochastic methods.

The paper tackles the limitation of stochastic primal-dual algorithms that require bilinear structure for faster convergence, proposing new algorithms that achieve a convergence rate of O(1/T) for convex-concave problems without this assumption, such as in robust model learning and AUC maximization.

Previous studies on stochastic primal-dual algorithms for solving min-max problems with faster convergence heavily rely on the bilinear structure of the problem, which restricts their applicability to a narrowed range of problems. The main contribution of this paper is the design and analysis of new stochastic primal-dual algorithms that use a mixture of stochastic gradient updates and a logarithmic number of deterministic dual updates for solving a family of convex-concave problems with no bilinear structure assumed. Faster convergence rates than $O(1/\sqrt{T})$ with $T$ being the number of stochastic gradient updates are established under some mild conditions of involved functions on the primal and the dual variable. For example, for a family of problems that enjoy a weak strong convexity in terms of the primal variable and has a strongly concave function of the dual variable, the convergence rate of the proposed algorithm is $O(1/T)$. We also investigate the effectiveness of the proposed algorithms for learning robust models and empirical AUC maximization.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes