LG, DC, DS, OC, ML | Feb 17, 2019

Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free

arXiv:1902.06332v3 | 9 citations
AI Analysis

This paper addresses the communication-cost bottleneck in scalable machine learning training for constrained optimization problems, offering a novel solution with potentially broad impact.

The paper reduces gradient communication overhead in distributed constrained optimization by proposing Quantized-Frank-Wolfe (QFW), a projection-free and communication-efficient algorithm. It provides strong theoretical convergence guarantees and validates the method empirically against natural baselines.

How can we efficiently mitigate the overhead of gradient communications in distributed optimization? This problem is at the heart of training scalable machine learning models and has been mainly studied in the unconstrained setting. In this paper, we propose Quantized-Frank-Wolfe (QFW), the first projection-free and communication-efficient algorithm for solving constrained optimization problems at scale. We consider both convex and non-convex objective functions, expressed as a finite sum or, more generally, as a stochastic optimization problem, and provide strong theoretical guarantees on the convergence rate of QFW. This is accomplished by proposing novel quantization schemes that efficiently compress gradients while controlling the noise variance introduced during this process. Finally, we empirically validate the efficiency of QFW in terms of communication and the quality of the returned solution against natural baselines.
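To make the two ingredients named in the abstract concrete, here is a minimal single-machine sketch of a Frank-Wolfe loop driven by quantized gradients. It is an illustration under stated assumptions, not the paper's algorithm: the quantizer is a generic unbiased stochastic-rounding scheme, the constraint set is an l1 ball, the objective is a simple quadratic, and the step sizes and gradient averaging are conventional choices. All function names and parameters (`quantize`, `lmo_l1`, `frank_wolfe_quantized`, `levels`, `radius`) are hypothetical.

```python
import random


def quantize(g, levels=4, rng=random):
    """Illustrative unbiased quantizer (an assumption, not the paper's scheme).

    Each coordinate is rounded at random to a neighbouring point of a grid
    with `levels` steps on [0, max|g|], so the expected output equals g.
    """
    norm = max(abs(x) for x in g)
    if norm == 0.0:
        return [0.0] * len(g)
    out = []
    for x in g:
        scaled = abs(x) / norm * levels      # position on the grid, in [0, levels]
        low = int(scaled)                    # grid point just below
        # Round up with probability equal to the fractional part.
        q = low + (1 if rng.random() < scaled - low else 0)
        sign = 1.0 if x >= 0 else -1.0
        out.append(sign * norm * q / levels)
    return out


def lmo_l1(d, radius):
    """Linear minimization oracle: a vertex of the l1 ball minimizing <d, v>."""
    i = max(range(len(d)), key=lambda j: abs(d[j]))
    v = [0.0] * len(d)
    v[i] = -radius if d[i] > 0 else radius
    return v


def frank_wolfe_quantized(b, radius=1.0, steps=200, seed=0):
    """Minimize 0.5 * ||x - b||^2 over the l1 ball using quantized gradients."""
    rng = random.Random(seed)
    x = [0.0] * len(b)                       # feasible starting point
    d = [0.0] * len(b)                       # running average of quantized gradients
    for t in range(1, steps + 1):
        grad = [xi - bi for xi, bi in zip(x, b)]
        q = quantize(grad, rng=rng)          # what a worker would transmit
        rho = 2.0 / (t + 1)                  # averaging weight (assumed schedule)
        d = [(1 - rho) * di + rho * qi for di, qi in zip(d, q)]
        v = lmo_l1(d, radius)                # no projection, only a linear oracle
        gamma = 2.0 / (t + 2)                # classic Frank-Wolfe step size
        x = [(1 - gamma) * xi + gamma * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    x = frank_wolfe_quantized([0.9, -0.4, 0.1, 0.0])
    print(x, sum(abs(xi) for xi in x))
```

Because every iterate is a convex combination of points in the l1 ball, the loop stays feasible without a projection step. The quantizer only needs to send a scale, a sign, and a small integer per coordinate, which is where the communication saving would come from in a distributed setting.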
