FedDAG: Clustered Federated Learning via Global Data and Gradient Integration for Heterogeneous Environments
This paper offers an incremental improvement for researchers and practitioners working with federated learning in heterogeneous data environments.
This paper addresses performance degradation in Federated Learning (FL) due to heterogeneous client data by introducing FedDAG, a clustered FL framework. FedDAG integrates both data and gradient information for a more holistic client similarity assessment and uses a dual-encoder architecture to enable cross-cluster feature transfer, consistently outperforming state-of-the-art clustered FL baselines in accuracy.
Federated Learning (FL) enables a group of clients to collaboratively train a model without sharing individual data, but its performance drops when client data are heterogeneous. Clustered FL tackles this by grouping similar clients. However, existing clustered FL approaches rely solely on either data similarity or gradient similarity, which yields an incomplete assessment of client similarity. Prior clustered FL approaches also restrict knowledge and representation sharing to clients within the same cluster, which prevents cluster models from benefiting from the diverse client population across clusters. To address these limitations, the paper introduces FedDAG, a clustered FL framework that employs a weighted, class-wise similarity metric integrating both data and gradient information, providing a more holistic measure of similarity during clustering. In addition, FedDAG adopts a dual-encoder architecture for cluster models, comprising a primary encoder trained on its own clients' data and a secondary encoder refined using gradients from complementary clusters. This enables cross-cluster feature transfer while preserving cluster-specific specialization. Experiments on diverse benchmarks and data heterogeneity settings show that FedDAG consistently outperforms state-of-the-art clustered FL baselines in accuracy.
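The abstract does not give the exact form of the similarity metric, so the following is only a minimal sketch of what a weighted, class-wise score combining data and gradient information could look like. Everything specific here is an assumption made for illustration: the function names `cosine` and `combined_similarity`, the use of cosine similarity, the per-class summary vectors, the mixing coefficient `alpha`, and the `class_weights` dictionary are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(data_a, data_b, grad_a, grad_b, class_weights, alpha=0.5):
    """Hypothetical class-wise similarity between two clients.

    data_x: dict mapping class label -> per-class data summary vector (assumed).
    grad_x: dict mapping class label -> per-class gradient vector (assumed).
    class_weights: dict mapping class label -> non-negative weight (assumed).
    alpha: share given to data similarity; (1 - alpha) goes to gradients (assumed).
    """
    # Only classes for which both clients have data and gradient summaries count.
    shared = set(data_a) & set(data_b) & set(grad_a) & set(grad_b)
    total_weight = sum(class_weights.get(c, 0.0) for c in shared)
    if total_weight == 0.0:
        return 0.0

    score = 0.0
    for c in shared:
        data_sim = cosine(data_a[c], data_b[c])
        grad_sim = cosine(grad_a[c], grad_b[c])
        # Mix the two views of similarity, then weight by class importance.
        score += class_weights.get(c, 0.0) * (alpha * data_sim + (1 - alpha) * grad_sim)
    return score / total_weight
```

In this sketch, two clients whose per-class data summaries differ but whose gradients agree receive an intermediate score rather than being treated as fully similar or fully dissimilar, which is the kind of behavior a single-signal metric cannot express. The resulting pairwise scores could then be passed to any standard clustering routine; which one FedDAG uses is not stated in the abstract.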