LGMLOct 26, 2018

Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy

arXiv:1810.11507v220 citations
Originality Incremental advance
AI Analysis

This work addresses efficient distributed optimization for large-scale machine learning tasks, offering incremental improvements in computational and communication efficiency.

The paper tackles large-scale empirical risk minimization by proposing a distributed multistage algorithm (DANCE) that gradually increases sample size to efficiently achieve statistical accuracy with fewer data passes, and it outperforms comparable methods in learning problems including neural networks.

In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE) method in which sample size is gradually increasing to quickly obtain a solution whose empirical loss is under satisfactory statistical accuracy. Our proposed method is multistage in which the solution of a stage serves as a warm start for the next stage which contains more samples (including the samples in the previous stage). The proposed multistage algorithm reduces the number of passes over data to achieve the statistical accuracy of the full training set. Moreover, our algorithm in nature is easy to be distributed and shares the strong scaling property indicating that acceleration is always expected by using more computing nodes. Various iteration complexity results regarding descent direction computation, communication efficiency and stopping criteria are analyzed under convex setting. Our numerical results illustrate that the proposed method outperforms other comparable methods for solving learning problems including neural networks.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes