LG OC ML

Oct 26, 2020

Stochastic Optimization with Laggard Data Pipelines

Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, Cyril Zhang

arXiv:2010.13639v1

5.812 citations

Originality: Incremental advance

AI Analysis

This paper addresses efficiency issues in large-scale machine learning optimization for practitioners whose training is bottlenecked by laggard data pipelines. It is an incremental advance, as it builds on the existing data echoing method.

The paper tackles performance bottlenecks in massively parallel optimization pipelines by analyzing data echoing, showing that it provides provable speedups on the curvature-dominated part of the convergence rate while maintaining the optimal statistical rate.

State-of-the-art optimization is steadily shifting towards massively parallel pipelines with extremely large batch sizes. As a consequence, CPU-bound preprocessing and disk/memory/network operations have emerged as new performance bottlenecks, as opposed to hardware-accelerated gradient computations. In this regime, a recently proposed approach is data echoing (Choi et al., 2019), which takes repeated gradient steps on the same batch while waiting for fresh data to arrive from upstream. We provide the first convergence analyses of "data-echoed" extensions of common optimization methods, showing that they exhibit provable improvements over their synchronous counterparts. Specifically, we show that in convex optimization with stochastic minibatches, data echoing affords speedups on the curvature-dominated part of the convergence rate, while maintaining the optimal statistical rate.
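To make the idea of data echoing concrete, here is a minimal sketch of plain minibatch SGD that takes several gradient steps on each batch before moving on to the next one. It is an illustration under stated assumptions, not the algorithms analyzed in the paper: the least-squares objective, the fixed step size, and the names `echoed_sgd`, `echo_factor`, `grad_least_squares`, and `make_batches` are all hypothetical choices made for this example.

```python
import random


def grad_least_squares(w, batch):
    """Gradient of 0.5 * mean((x . w - y)^2) over one batch of (x, y) pairs."""
    g = [0.0] * len(w)
    for x, y in batch:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += residual * xj / len(batch)
    return g


def echoed_sgd(batches, dim, echo_factor=1, lr=0.1):
    """SGD that reuses each batch `echo_factor` times (hypothetical helper).

    echo_factor=1 is ordinary minibatch SGD; larger values stand in for the
    extra steps taken while the next batch is still being loaded upstream.
    """
    w = [0.0] * dim
    for batch in batches:                 # one fresh batch per slow pipeline read
        for _ in range(echo_factor):      # repeated steps on the same batch
            g = grad_least_squares(w, batch)
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def make_batches(w_true, num_batches, batch_size, noise, seed):
    """Synthetic noisy linear-regression batches (assumption for the demo)."""
    rng = random.Random(seed)
    batches = []
    for _ in range(num_batches):
        batch = []
        for _ in range(batch_size):
            x = [rng.gauss(0.0, 1.0) for _ in w_true]
            y = sum(wi * xi for wi, xi in zip(w_true, x)) + rng.gauss(0.0, noise)
            batch.append((x, y))
        batches.append(batch)
    return batches


if __name__ == "__main__":
    w_true = [1.0, -2.0, 0.5]
    batches = make_batches(w_true, num_batches=10, batch_size=128, noise=0.1, seed=0)
    for k in (1, 4):
        w = echoed_sgd(batches, dim=3, echo_factor=k)
        err = sum((a - b) ** 2 for a, b in zip(w, w_true)) ** 0.5
        print(f"echo_factor={k}: distance to w_true = {err:.4f}")
```

In this toy setting the number of fresh batches is held fixed, so a larger `echo_factor` simply means more optimization steps per unit of data read. This loosely mirrors the regime in the abstract where data loading, not gradient computation, is the bottleneck.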
