OCLGNAFeb 26, 2021

Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums

arXiv:2102.13643v222 citations
AI Analysis

This work addresses optimization efficiency for machine learning applications like support vector machines, offering incremental improvements in computational complexity for nonsmooth convex problems.

The paper tackles structured nonsmooth convex finite-sum optimization problems, such as support vector machines and least absolute deviation, by proposing a new algorithm called Variance Reduction via Primal-Dual Accelerated Dual Averaging (VRPDA). It achieves improved complexity results, e.g., O(nd log min{1/ε, n} + d/ε) for general convex settings, outperforming state-of-the-art estimates.

We study structured nonsmooth convex finite-sum optimization that appears widely in machine learning applications, including support vector machines and least absolute deviation. For the primal-dual formulation of this problem, we propose a novel algorithm called \emph{Variance Reduction via Primal-Dual Accelerated Dual Averaging (\vrpda)}. In the nonsmooth and general convex setting, \vrpda~has the overall complexity $O(nd\log\min \{1/ε, n\} + d/ε)$ in terms of the primal-dual gap, where $n$ denotes the number of samples, $d$ the dimension of the primal variables, and $ε$ the desired accuracy. In the nonsmooth and strongly convex setting, the overall complexity of \vrpda~becomes $O(nd\log\min\{1/ε, n\} + d/\sqrtε)$ in terms of both the primal-dual gap and the distance between iterate and optimal solution. Both these results for \vrpda~improve significantly on state-of-the-art complexity estimates, which are $O(nd\log \min\{1/ε, n\} + \sqrt{n}d/ε)$ for the nonsmooth and general convex setting and $O(nd\log \min\{1/ε, n\} + \sqrt{n}d/\sqrtε)$ for the nonsmooth and strongly convex setting, in a much more simple and straightforward way. Moreover, both complexities are better than \emph{lower} bounds for general convex finite sums that lack the particular (common) structure that we consider. Our theoretical results are supported by numerical experiments, which confirm the competitive performance of \vrpda~compared to state-of-the-art.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes