DCLGMLFeb 10, 2020

Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts

arXiv:2002.04013v365 citations
AI Analysis

This addresses the issue of high costs limiting researcher access to state-of-the-art model training, potentially enabling broader participation in AI development.

The paper tackles the problem of prohibitively expensive training of large neural networks by proposing Learning@home, a novel paradigm for crowdsourced training using decentralized mixture-of-experts to handle poorly connected participants, analyzing its performance and comparing it to existing methods.

Many recent breakthroughs in deep learning were achieved by training increasingly larger models on massive datasets. However, training such models can be prohibitively expensive. For instance, the cluster used to train GPT-3 costs over \$250 million. As a result, most researchers cannot afford to train state of the art models and contribute to their development. Hypothetically, a researcher could crowdsource the training of large neural networks with thousands of regular PCs provided by volunteers. The raw computing power of a hundred thousand \$2500 desktops dwarfs that of a \$250M server pod, but one cannot utilize that power efficiently with conventional distributed training methods. In this work, we propose Learning@home: a novel neural network training paradigm designed to handle large amounts of poorly connected participants. We analyze the performance, reliability, and architectural constraints of this paradigm and compare it against existing distributed training techniques.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes