LGFeb 12, 2015

Scalable Stochastic Alternating Direction Method of Multipliers

arXiv:1502.03529v39 citations
AI Analysis

This addresses scalability in large-scale optimization for machine learning, offering a novel method that improves both computation and storage efficiency compared to existing approaches.

The paper tackles the scalability issues of stochastic ADMM methods by proposing SCAS-ADMM, which achieves a convergence rate of O(1/T) on general convex problems without storing historic gradients, and experiments show it achieves state-of-the-art performance on graph-guided fused lasso.

Stochastic alternating direction method of multipliers (ADMM), which visits only one sample or a mini-batch of samples each time, has recently been proved to achieve better performance than batch ADMM. However, most stochastic methods can only achieve a convergence rate $O(1/\sqrt T)$ on general convex problems,where T is the number of iterations. Hence, these methods are not scalable with respect to convergence rate (computation cost). There exists only one stochastic method, called SA-ADMM, which can achieve convergence rate $O(1/T)$ on general convex problems. However, an extra memory is needed for SA-ADMM to store the historic gradients on all samples, and thus it is not scalable with respect to storage cost. In this paper, we propose a novel method, called scalable stochastic ADMM(SCAS-ADMM), for large-scale optimization and learning problems. Without the need to store the historic gradients, SCAS-ADMM can achieve the same convergence rate $O(1/T)$ as the best stochastic method SA-ADMM and batch ADMM on general convex problems. Experiments on graph-guided fused lasso show that SCAS-ADMM can achieve state-of-the-art performance in real applications

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes