OC LG

Nov 2, 2021

Coordinate Linear Variance Reduction for Generalized Linear Programming

Chaobing Song, Cheuk Yin Lin, Stephen J. Wright, Jelena Diakonikolas

arXiv:2111.01842v4

12.015 citations

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the efficiency of first-order optimization for large-scale generalized linear programs, a problem class that arises in machine learning and operations research. It offers an incremental improvement in algorithm design for problems with simple convex regularizers and simple convex set constraints.

The authors tackle large-scale generalized linear programs (GLP) with Coordinate Linear Variance Reduction (CLVR), a first-order algorithm whose complexity bounds depend on the max row norm of the linear constraint matrix rather than its spectral norm. When the regularization terms and constraints are separable, the bounds scale with the number of nonzero elements of that matrix rather than its dimensions. Numerical experiments verify practical effectiveness in terms of wall-clock time and number of data passes.
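To make the distinction between the two matrix constants concrete, the following is a minimal sketch that computes both for a small dense matrix. The helper names `max_row_norm` and `spectral_norm` and the test matrix are illustrative assumptions, not part of the paper's code. The only facts used are standard: the max row norm never exceeds the spectral norm, and the spectral norm is at most the square root of the row count times the max row norm.

```python
import numpy as np


def max_row_norm(A: np.ndarray) -> float:
    """Largest Euclidean norm among the rows of A (hypothetical helper)."""
    return float(np.max(np.linalg.norm(A, axis=1)))


def spectral_norm(A: np.ndarray) -> float:
    """Largest singular value of A (hypothetical helper)."""
    return float(np.linalg.norm(A, 2))


# Illustrative matrix: every entry is 1, shape 4 x 9.
# Each row has norm sqrt(9) = 3, while the spectral norm is sqrt(4 * 9) = 6.
A = np.ones((4, 9))
R = max_row_norm(A)
L = spectral_norm(A)

# The general sandwich: R <= L <= sqrt(m) * R, where m is the number of rows.
m = A.shape[0]
gap = L / R  # how much larger the spectral norm is for this matrix
```

For this matrix the gap is a factor of 2, the largest possible for four rows. For an identity matrix the two constants coincide, so how much a row-norm-based bound helps depends on the structure of the constraint matrix.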

We study a class of generalized linear programs (GLP) in a large-scale setting, which includes a simple, possibly nonsmooth convex regularizer and simple convex set constraints. By reformulating (GLP) as an equivalent convex-concave min-max problem, we show that the linear structure in the problem can be used to design an efficient, scalable first-order algorithm, to which we give the name \emph{Coordinate Linear Variance Reduction} (\textsc{clvr}; pronounced "clever"). \textsc{clvr} yields improved complexity results for (GLP) that depend on the max row norm of the linear constraint matrix in (GLP) rather than the spectral norm. When the regularization terms and constraints are separable, \textsc{clvr} admits an efficient lazy update strategy that makes its complexity bounds scale with the number of nonzero elements of the linear constraint matrix in (GLP) rather than the matrix dimensions. For the special case of linear programs, we exploit sharpness and propose a restart scheme for \textsc{clvr} that obtains empirical linear convergence. We also show that Distributionally Robust Optimization (DRO) problems with ambiguity sets based on both $f$-divergence and Wasserstein metrics can be reformulated as (GLP) by introducing sparsely connected auxiliary variables. We complement our theoretical guarantees with numerical experiments that verify our algorithm's practical effectiveness, in terms of wall-clock time and number of data passes.
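The abstract does not write the problem out. As a hedged sketch, a problem of the kind it describes (linear objective, simple convex regularizer, linear constraints, simple convex set) and its Lagrangian min-max form could be written as follows. The symbols $c$, $A$, $b$, $r$, $\mathcal{X}$, and $y$ are assumed notation introduced here for illustration, and the equality-constrained form is an assumption rather than a quotation from the paper.

```latex
% Assumed form of (GLP): linear objective plus a simple convex regularizer r,
% linear constraints, and a simple convex set X.
\min_{x \in \mathcal{X}} \; c^\top x + r(x)
\quad \text{subject to} \quad A x = b.

% Introducing a dual variable y for the linear constraint gives a
% convex-concave min-max problem that is linear in y and couples x and y
% only through the bilinear term y^\top A x.
\min_{x \in \mathcal{X}} \; \max_{y} \;
  c^\top x + r(x) + y^\top (A x - b).
```

Under this assumed form, the inner maximization over $y$ is unbounded unless $Ax = b$, which is why the min-max problem has the same solutions in $x$ as the constrained problem.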
