Stochastic Clustered Federated Learning
This work addresses the challenge of Non-IID data in federated learning systems, a common issue in real-world edge device deployments. The contribution is incremental, as it builds on existing clustered federated learning methods.
The paper tackles the problem of performance degradation in federated learning due to Non-IID data by proposing StoCFL, a clustered federated learning approach that groups clients with similar data distributions, resulting in improved model performance across various Non-IID settings and a real-world dataset.
Federated learning is a distributed learning framework that takes full advantage of private data samples kept on edge devices. In real-world federated learning systems, these data samples are often decentralized and not independent and identically distributed (Non-IID), causing divergence and performance degradation in the federated learning process. As a new solution, clustered federated learning (CFL) groups federated clients with similar data distributions to mitigate the Non-IID effects and train a better model for every cluster. This paper proposes StoCFL, a novel clustered federated learning approach for generic Non-IID issues. In detail, StoCFL implements a flexible CFL framework that supports an arbitrary proportion of client participation and newly joined clients in a varying federated learning (FL) system, while maintaining a substantial improvement in model performance. Extensive experiments are conducted using four basic Non-IID settings and a real-world dataset. The results show that StoCFL obtains promising clustering results even when the number of clusters is unknown. Based on the client clustering results, models trained with StoCFL outperform baseline approaches in a variety of contexts.
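
The general idea of clustered federated learning described above can be illustrated with a minimal sketch. This is not StoCFL's actual procedure, which the abstract does not specify; it is a generic, assumed scheme in which each client is represented by a small update vector, clients are grouped greedily by cosine similarity against a hypothetical `threshold`, and one model is averaged per cluster. The function names (`cluster_clients`, `average_per_cluster`) and the toy vectors are made up for illustration. Because clusters are created on demand, the number of clusters does not need to be known in advance.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_clients(updates, threshold=0.5):
    """Greedy threshold clustering (an assumed, generic scheme).

    A client joins the first cluster whose mean update has cosine
    similarity >= threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of client indices
    for i, update in enumerate(updates):
        for members in clusters:
            centroid = mean_vector([updates[j] for j in members])
            if cosine(update, centroid) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append([i])
    return clusters


def average_per_cluster(updates, clusters):
    """One averaged 'model' per cluster (plain unweighted averaging)."""
    return [mean_vector([updates[j] for j in members]) for members in clusters]


# Toy client updates: clients 0 and 1 point one way, clients 2 and 3 another.
updates = [
    [1.0, 0.1],
    [0.9, 0.2],
    [-0.1, 1.0],
    [0.0, 0.9],
]
clusters = cluster_clients(updates, threshold=0.5)
cluster_models = average_per_cluster(updates, clusters)
print(clusters)
```

In this toy setup a newly joined client could be handled by running the same similarity check against the existing cluster means, which is one simple way to accommodate the partial participation and late-joining clients that the abstract mentions.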