LG · ML · Jun 20, 2018

A Distributed Second-Order Algorithm You Can Trust

arXiv:1806.07569v1 · 33 citations
Originality: Highly original
AI Analysis

This work addresses communication bottlenecks in distributed machine learning for generalized linear models, offering a practical solution with theoretical guarantees.

The paper tackles the high communication cost of distributed second-order optimization. Its algorithm computes only the diagonal blocks of the Hessian on each worker and uses an adaptive, trust-region-like scheme to compensate for the resulting approximation error. The authors report state-of-the-art results on multiple large benchmark datasets.

Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years. While first-order methods seem to dominate the field, second-order methods are nevertheless attractive as they potentially require fewer communication rounds to converge. However, there are significant drawbacks that impede their wide adoption, such as the computation and communication of a large Hessian matrix. In this paper we present a new algorithm for distributed training of generalized linear models that only requires the computation of diagonal blocks of the Hessian matrix on the individual workers. To deal with this approximate information we propose an adaptive approach that, akin to trust-region methods, dynamically adapts the auxiliary model to compensate for modeling errors. We provide theoretical rates of convergence for a wide class of problems including L1-regularized objectives. We also demonstrate that our approach achieves state-of-the-art results on multiple large benchmark datasets.
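The idea in the abstract can be illustrated with a small single-process sketch. It is not the authors' implementation: it assumes a ridge-regression objective (a simple generalized linear model), a feature-wise partition of the data across workers, and a scalar `sigma` that scales the block-diagonal quadratic model and is adapted with a standard trust-region ratio test. The names `fit`, `block_newton_step`, `sigma`, `eta_lo`, `eta_hi`, and `gamma` are hypothetical and chosen only for illustration.

```python
import numpy as np


def objective(X, y, w, lam):
    """Ridge objective: 0.5/n * ||Xw - y||^2 + 0.5 * lam * ||w||^2."""
    r = X @ w - y
    return 0.5 * (r @ r) / len(y) + 0.5 * lam * (w @ w)


def block_newton_step(X, y, w, lam, blocks, sigma):
    """Step from a block-diagonal Hessian model scaled by sigma (assumed form).

    Each loop iteration stands in for one worker that only needs its own
    diagonal Hessian block. Returns the step and the decrease predicted
    by the auxiliary model.
    """
    n = len(y)
    grad = X.T @ (X @ w - y) / n + lam * w
    step = np.zeros_like(w)
    predicted = 0.0
    for idx in blocks:
        Xk = X[:, idx]
        Hk = Xk.T @ Xk / n + lam * np.eye(len(idx))  # diagonal Hessian block
        dk = -np.linalg.solve(Hk, grad[idx]) / sigma
        step[idx] = dk
        # Model decrease: -(g_k . d_k) - (sigma / 2) * d_k^T H_k d_k
        predicted += -(grad[idx] @ dk) - 0.5 * sigma * (dk @ Hk @ dk)
    return step, predicted


def fit(X, y, lam=0.1, num_workers=4, iters=100, sigma=1.0,
        eta_lo=0.25, eta_hi=0.75, gamma=2.0):
    """Adapt sigma like a trust-region radius (illustrative thresholds)."""
    w = np.zeros(X.shape[1])
    blocks = np.array_split(np.arange(X.shape[1]), num_workers)
    for _ in range(iters):
        step, predicted = block_newton_step(X, y, w, lam, blocks, sigma)
        if predicted < 1e-14:  # gradient is numerically zero
            break
        actual = objective(X, y, w, lam) - objective(X, y, w + step, lam)
        rho = actual / predicted  # agreement between model and objective
        if rho > eta_lo:
            w = w + step          # model was good enough: accept the step
        if rho < eta_lo:
            sigma *= gamma        # model too optimistic: be more conservative
        elif rho > eta_hi:
            sigma /= gamma        # model accurate: allow larger steps
    return w, sigma
```

Calling `fit(X, y, num_workers=4)` returns the weights and the final value of `sigma`. With a single worker the block-diagonal Hessian equals the full Hessian, so the first step is an exact Newton step for this quadratic objective; with more workers the off-diagonal blocks are ignored and the ratio test decides how far the approximate model can be trusted.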

