LG · DC · ML · Sep 29, 2020

A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning

arXiv:2009.13982v1
Originality: Incremental advance
AI Analysis

This work addresses privacy and efficiency challenges in distributed machine learning, though it appears incremental as it builds on existing ADMM and layerwise optimization methods.

The paper tackles the problem of training large neural networks in a decentralized setting with privacy constraints, achieving equivalent performance to centralized training while reducing computational and communication costs.

We design a low-complexity decentralized learning algorithm to train a recently proposed large neural network across distributed processing nodes (workers). We assume that the communication network between the workers is synchronized and can be modeled by a doubly-stochastic mixing matrix, with no master node. In our setup, the training data is distributed among the workers but is not shared during training because of privacy and security concerns. Using the alternating direction method of multipliers (ADMM) together with a layer-wise convex optimization approach, we propose a decentralized learning algorithm with low computational complexity and low communication cost among the workers. We show that it is possible to achieve the same learning performance as if the data were available in a single place. Finally, we experimentally illustrate the time complexity and convergence behavior of the algorithm.
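
As a rough illustration of the setting described in the abstract, the sketch below shows workers on a ring that exchange values only with their neighbours through a doubly-stochastic mixing matrix, with no master node, and that still arrive at the least-squares solution a centralized solver would produce. It is not the paper's ADMM or layer-wise algorithm: plain consensus averaging is swapped in to keep the example short. The ring topology, the helper names `ring_mixing_matrix` and `gossip`, and the toy data are all assumptions made for illustration.

```python
# Toy sketch (assumption): consensus averaging, NOT the paper's ADMM method.

def ring_mixing_matrix(n):
    """Symmetric, doubly-stochastic matrix for a ring of n >= 3 workers.

    Each worker weights itself and its two ring neighbours by 1/3.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        W[i][i] += 1.0 / 3.0
        W[i][(i - 1) % n] += 1.0 / 3.0
        W[i][(i + 1) % n] += 1.0 / 3.0
    return W


def gossip(values, W, rounds):
    """Run synchronized mixing rounds: each worker replaces its value
    with a weighted average of its own and its neighbours' values."""
    n = len(values)
    for _ in range(rounds):
        values = [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]
    return values


# Hypothetical private data: each worker holds its own (x, y) pairs.
local_data = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(3.0, 6.2)],
    [(4.0, 7.8), (5.0, 10.1)],
    [(6.0, 12.0)],
]

# Centralized reference: 1-D least squares w = sum(x*y) / sum(x*x) on pooled data.
pooled = [pair for worker in local_data for pair in worker]
w_central = sum(x * y for x, y in pooled) / sum(x * x for x, _ in pooled)

# Decentralized: workers exchange only summary statistics, never raw samples.
sxy = [sum(x * y for x, y in worker) for worker in local_data]
sxx = [sum(x * x for x, _ in worker) for worker in local_data]

W = ring_mixing_matrix(len(local_data))
avg_sxy = gossip(sxy, W, rounds=50)
avg_sxx = gossip(sxx, W, rounds=50)

# Each worker forms its own estimate; the 1/n factors of the two averages cancel.
w_local = [a / b for a, b in zip(avg_sxy, avg_sxx)]

print("centralized:", w_central)
print("per-worker :", w_local)
```

Because the rows and columns of the mixing matrix each sum to one, repeated mixing preserves the network-wide average and drives every worker towards it, which is why each local estimate approaches the pooled solution in this toy case.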

