Addressing Data Heterogeneity in Decentralized Learning via Topological Pre-processing
This work addresses convergence issues in decentralized learning under data heterogeneity, offering an incremental improvement through a novel graph construction method.
The paper tackles data heterogeneity in decentralized learning by proposing a topological pre-processing method that clusters peers and arranges them in a locally heterogeneous graph, which improves convergence while maintaining data privacy. The results show that this approach outperforms locally homogeneous graphs of similar size, scales to large graphs with small pre-processing overhead, and remains robust to network partitions.
Recently, local peer topology has been shown to influence the overall convergence of decentralized learning (DL) graphs in the presence of data heterogeneity. In this paper, we demonstrate the advantages of constructing a proxy-based locally heterogeneous DL topology to enhance convergence and maintain data privacy. In particular, we propose a novel peer clumping strategy to efficiently cluster peers before arranging them in a final training graph. By showing that locally heterogeneous graphs outperform locally homogeneous graphs of similar size drawn from the same global data distribution, we present a strong case for topological pre-processing. Moreover, we demonstrate the scalability of our approach by showing that the overhead of the proposed topological pre-processing remains small in large graphs while the performance gains become even more pronounced. Finally, we show that our approach is robust in the presence of network partitions.
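To make the idea of a locally heterogeneous topology concrete, the following is a minimal illustrative sketch and not the clumping algorithm proposed in the paper. It assumes each peer exposes a single coarse proxy label (for example, its dominant class); the function name `build_locally_heterogeneous_graph`, the `peer_labels` mapping, and the ring between clumps are all hypothetical choices made for illustration. Peers are interleaved by label so that each clump mixes labels where possible, then each clump is fully connected and the clumps are linked in a ring.

```python
from collections import defaultdict
from itertools import combinations


def build_locally_heterogeneous_graph(peer_labels, clump_size):
    """Toy construction of a locally heterogeneous topology.

    peer_labels: dict mapping peer id -> proxy label (a hypothetical
                 stand-in for whatever summary a peer is willing to share).
    clump_size:  number of peers per clump.
    Returns (clumps, edges), where edges is a set of sorted peer-id pairs.
    """
    # Bucket peers by their proxy label.
    buckets = defaultdict(list)
    for peer, label in sorted(peer_labels.items()):
        buckets[label].append(peer)

    # Round-robin over the buckets so that consecutive peers carry
    # different labels whenever more than one label is still available.
    queues = [buckets[label] for label in sorted(buckets)]
    ordered = []
    while any(queues):
        for queue in queues:
            if queue:
                ordered.append(queue.pop(0))

    # Cut the interleaved order into clumps of mixed labels.
    clumps = [ordered[i:i + clump_size]
              for i in range(0, len(ordered), clump_size)]

    # Fully connect each clump (assumption: dense local neighborhoods).
    edges = set()
    for clump in clumps:
        for a, b in combinations(clump, 2):
            edges.add((min(a, b), max(a, b)))

    # Link the clumps in a ring through their first members
    # (assumption: any connected inter-clump pattern would do here).
    if len(clumps) > 1:
        for i, clump in enumerate(clumps):
            a, b = clump[0], clumps[(i + 1) % len(clumps)][0]
            if a != b:
                edges.add((min(a, b), max(a, b)))
    return clumps, edges


peer_labels = {0: "a", 1: "a", 2: "b", 3: "b", 4: "c", 5: "c"}
clumps, edges = build_locally_heterogeneous_graph(peer_labels, clump_size=3)
```

In this toy setting every clump contains one peer per label, so each peer's immediate neighborhood approximates the global label distribution. A locally homogeneous baseline would instead group peers that share a label.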