Asynchronous Stochastic Block Coordinate Descent with Variance Reduction
This work addresses scalability issues in large-scale optimization for machine learning practitioners, offering incremental improvements in asynchronous parallel methods.
The paper tackles the optimization of composite objective functions common in machine learning and computer vision by proposing an asynchronous stochastic block coordinate descent algorithm with variance reduction (AsySBCDVR). The algorithm achieves a linear convergence rate when the smooth part $f$ satisfies the optimal strong convexity property and a sublinear rate when $f$ is generally convex, with near-linear speedup on shared-memory systems.
Asynchronous parallel implementations of stochastic optimization have recently achieved great success in both theory and practice. Lock-free asynchronous implementations are more efficient than those that use write or read locks. In this paper, we focus on a composite objective function consisting of a smooth convex function $f$ and a block-separable convex function, a form that arises widely in machine learning and computer vision. We propose an asynchronous stochastic block coordinate descent algorithm accelerated with variance reduction (AsySBCDVR), which is lock-free in both implementation and analysis. AsySBCDVR is particularly important because it scales well with the sample size and the dimension simultaneously. We prove that AsySBCDVR achieves a linear convergence rate when $f$ satisfies the optimal strong convexity property, and a sublinear rate when $f$ is generally convex. More importantly, it can attain a near-linear speedup on a parallel system with shared memory.
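To make the ingredients concrete, the following is a minimal serial sketch of a variance-reduced proximal block coordinate update; it is not the AsySBCDVR algorithm itself. It assumes a finite-sum smooth part $f(x) = \frac{1}{n}\sum_i f_i(x)$ with least-squares terms and an $\ell_1$ penalty as the block-separable part. Both are illustrative choices that the abstract does not specify. The function names (`soft_threshold`, `objective`, `svrg_block_cd`), the step-size rule, and the synthetic data are likewise hypothetical. The asynchronous, lock-free aspect is omitted: in a shared-memory variant, several threads would presumably run the inner loop concurrently on the same vector `x` without locking.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(A, b, x, lam):
    """F(x) = (1 / 2n) * ||Ax - b||^2 + lam * ||x||_1 (assumed example problem)."""
    r = A @ x - b
    return 0.5 * np.mean(r ** 2) + lam * np.sum(np.abs(x))

def svrg_block_cd(A, b, lam, blocks, step, epochs=30, inner=None, seed=0):
    """Serial sketch: SVRG-style gradient estimate plus a proximal step on one block."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    inner = inner or 2 * n              # inner iterations per snapshot (assumed choice)
    x = np.zeros(d)
    for _ in range(epochs):
        snap = x.copy()                 # snapshot point for variance reduction
        full_grad = A.T @ (A @ snap - b) / n
        for _ in range(inner):
            i = rng.integers(n)                        # random sample index
            blk = blocks[rng.integers(len(blocks))]    # random coordinate block
            a_i = A[i]
            # Variance-reduced gradient estimate, restricted to the chosen block.
            g_x = a_i[blk] * (a_i @ x - b[i])
            g_s = a_i[blk] * (a_i @ snap - b[i])
            v = g_x - g_s + full_grad[blk]
            # Proximal step touches only the coordinates in the chosen block.
            x[blk] = soft_threshold(x[blk] - step * v, step * lam)
    return x

# Small synthetic demo (all values are made up for illustration).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -3.0, 1.5]
b = A @ x_true + 0.01 * rng.standard_normal(200)
lam = 0.1
blocks = np.array_split(np.arange(20), 4)
step = 0.1 / np.max(np.sum(A ** 2, axis=1))   # conservative step size (assumption)
x_hat = svrg_block_cd(A, b, lam, blocks, step)
print(objective(A, b, np.zeros(20), lam), objective(A, b, x_hat, lam))
```

Each inner iteration reads one sample and writes one block of coordinates, which is the sense in which a method of this kind can scale with the sample size and the dimension at the same time.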