ML OCSep 22, 2014

Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms

Yu-Xiang Wang, Veeranjaneyulu Sadhanala, Wei Dai, Willie Neiswanger, Suvrit Sra, Eric P. Xing

arXiv:1409.6086v246 citations

Originality Incremental advance

AI Analysis

This work addresses the need for faster optimization in machine learning applications like structural SVM and Group Fused Lasso, but it is incremental as it builds on the existing Block-Coordinate Frank-Wolfe method.

The paper tackles the problem of scaling Frank-Wolfe algorithms for optimization with block-separable constraints by developing parallel and distributed versions that use mini-batching and asynchronous updates to achieve speedups, with experiments on structural SVM and Group Fused Lasso showing significant speedups over state-of-the-art synchronous methods.

We develop parallel and distributed Frank-Wolfe algorithms; the former on shared memory machines with mini-batching, and the latter in a delayed update framework. Whenever possible, we perform computations asynchronously, which helps attain speedups on multicore machines as well as in distributed environments. Moreover, instead of worst-case bounded delays, our methods only depend (mildly) on \emph{expected} delays, allowing them to be robust to stragglers and faulty worker threads. Our algorithms assume block-separable constraints, and subsume the recent Block-Coordinate Frank-Wolfe (BCFW) method~\citep{lacoste2013block}. Our analysis reveals problem-dependent quantities that govern the speedups of our methods over BCFW. We present experiments on structural SVM and Group Fused Lasso, obtaining significant speedups over competing state-of-the-art (and synchronous) methods.

View on arXiv PDF

Similar