LG | AI | DC | Apr 15, 2021

D-Cliques: Compensating for Data Heterogeneity with Topology in Decentralized Federated Learning

arXiv:2104.07365v4 | 35 citations
Originality: Incremental advance
AI Analysis

This paper addresses data heterogeneity in decentralized federated learning and offers a practical solution for distributed machine learning systems. It is incremental, however, in that it builds on existing decentralized SGD methods.

The paper tackles slow convergence in decentralized federated learning caused by label distribution skew. It proposes a communication topology called D-Cliques, which groups nodes into sparsely interconnected cliques to reduce gradient bias. In a 1000-node setup, D-Cliques achieves convergence speed similar to that of a fully-connected topology with 98% fewer edges and 96% fewer messages.

The convergence speed of machine learning models trained with Federated Learning is significantly affected by heterogeneous data partitions, even more so in a fully decentralized setting without a central server. In this paper, we show that the impact of label distribution skew, an important type of data heterogeneity, can be significantly reduced by carefully designing the underlying communication topology. We present D-Cliques, a novel topology that reduces gradient bias by grouping nodes in sparsely interconnected cliques such that the label distribution in a clique is representative of the global label distribution. We also show how to adapt the updates of decentralized SGD to obtain unbiased gradients and implement an effective momentum with D-Cliques. Our extensive empirical evaluation on MNIST and CIFAR10 demonstrates that our approach provides convergence speed similar to that of a fully-connected topology, which provides the best convergence in a data-heterogeneous setting, with a significant reduction in the number of edges and messages. In a 1000-node topology, D-Cliques require 98% fewer edges and 96% fewer total messages, with further possible gains using a small-world topology across cliques.
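
The clique construction described above can be illustrated with a small sketch. It is not the authors' algorithm, only one possible reading under stated assumptions: each node holds examples of a single class, there are 10 classes, every clique takes exactly one node per class (so its label distribution matches the global one), and each pair of cliques is joined by a single edge. The function names `build_cliques` and `topology_edges` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cliques(node_labels, num_classes):
    """Group single-label nodes into cliques with one node per class.

    Assumption: every node holds data from exactly one class. Nodes left
    over once the rarest class is exhausted are simply not assigned.
    """
    by_label = defaultdict(list)
    for node, label in enumerate(node_labels):
        by_label[label].append(node)
    num_cliques = min(len(by_label[c]) for c in range(num_classes))
    return [[by_label[c][i] for c in range(num_classes)]
            for i in range(num_cliques)]


def topology_edges(cliques):
    """Return the edge set: full connectivity inside each clique, plus
    one edge between every pair of cliques (an assumed inter-clique scheme)."""
    edges = set()
    for clique in cliques:
        for a, b in combinations(clique, 2):
            edges.add((min(a, b), max(a, b)))
    for i, j in combinations(range(len(cliques)), 2):
        # Rotate which member carries the inter-clique edge to spread load.
        a = cliques[i][j % len(cliques[i])]
        b = cliques[j][i % len(cliques[j])]
        edges.add((min(a, b), max(a, b)))
    return edges


# Hypothetical setup: 1000 nodes, 10 classes, labels assigned round-robin.
num_nodes, num_classes = 1000, 10
node_labels = [n % num_classes for n in range(num_nodes)]

cliques = build_cliques(node_labels, num_classes)
edges = topology_edges(cliques)
full_edges = num_nodes * (num_nodes - 1) // 2  # fully-connected baseline
reduction = 1 - len(edges) / full_edges
```

Under these assumptions the sketch yields 100 cliques and 9,450 edges against 499,500 for the fully-connected baseline, a reduction of roughly 98%, which is consistent with the edge figure quoted in the abstract. It does not model message counts or the small-world variant.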

