Federated Learning Clients Clustering with Adaptation to Data Drifts
This work addresses performance degradation caused by data drift in federated learning on edge devices, offering an incremental improvement over existing clustered FL methods.
The paper tackles the problem of data drift in federated learning, which degrades cluster homogeneity and model performance, by proposing FIELDING, a framework that detects drift and performs selective re-clustering. Experiments show it improves final model accuracy by 1.9-5.9% and achieves target accuracy 1.16x-2.23x faster than state-of-the-art methods.
Federated Learning (FL) trains deep models across edge devices without centralizing raw data, preserving user privacy. However, client heterogeneity slows down convergence and limits global model accuracy. Clustered FL (CFL) mitigates this by grouping clients with similar representations and training a separate model for each cluster. In practice, client data evolves over time, a phenomenon we refer to as data drift, which breaks cluster homogeneity and degrades performance. Data drift can take different forms depending on whether changes occur in the output values, the input features, or the relationship between them. We propose FIELDING, a CFL framework for handling diverse types of data drift with low overhead. FIELDING detects drift at individual clients and performs selective re-clustering to balance cluster quality and model performance, while remaining robust to malicious clients and varying levels of heterogeneity. Experiments show that FIELDING improves final model accuracy by 1.9-5.9% and achieves target accuracy 1.16x-2.23x faster than existing state-of-the-art CFL methods.
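To make the idea of per-client drift detection followed by selective re-clustering concrete, here is a minimal sketch. It is not FIELDING's actual algorithm, and it makes several assumptions the abstract does not state: each client is summarized by a small representation vector (here, a label distribution), drift is flagged when the Euclidean distance between the old and new vector exceeds a fixed threshold, and only drifted clients are moved, each to the nearest centroid of the non-drifted clients. All function names, parameters, and the threshold value are hypothetical.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def detect_drift(old_rep, new_rep, threshold=0.2):
    """Flag drift when a client's representation moved more than `threshold`.

    The threshold value is an illustrative assumption.
    """
    return dist(old_rep, new_rep) > threshold


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def selective_recluster(assignments, old_reps, new_reps, threshold=0.2):
    """Reassign only drifted clients; leave all other clients in place.

    assignments: dict mapping client id -> cluster id
    old_reps, new_reps: dicts mapping client id -> list of floats
    Returns (new_assignments, drifted_client_ids).
    """
    drifted = [c for c in assignments
               if detect_drift(old_reps[c], new_reps[c], threshold)]

    # Build centroids from the clients that did NOT drift.
    # A cluster whose members all drifted gets no centroid in this sketch.
    members = {}
    for client, cluster in assignments.items():
        if client not in drifted:
            members.setdefault(cluster, []).append(new_reps[client])
    centroids = {k: centroid(v) for k, v in members.items()}

    new_assignments = dict(assignments)
    for client in drifted:
        if centroids:  # keep the old assignment if no centroid exists
            new_assignments[client] = min(
                centroids, key=lambda k: dist(new_reps[client], centroids[k]))
    return new_assignments, drifted


# Toy example: client "b" shifts from mostly class 0 to mostly class 1.
assignments = {"a": 0, "b": 0, "c": 1, "d": 1}
old_reps = {"a": [0.9, 0.1], "b": [0.9, 0.1], "c": [0.1, 0.9], "d": [0.1, 0.9]}
new_reps = {"a": [0.9, 0.1], "b": [0.2, 0.8], "c": [0.1, 0.9], "d": [0.1, 0.9]}

new_assignments, drifted = selective_recluster(assignments, old_reps, new_reps)
```

In this toy run only client "b" is flagged and moved to cluster 1, while the other three clients keep their assignments. Touching only drifted clients is what keeps the overhead low in this sketch; a fuller design would also need a rule for when enough clients have drifted that a global re-clustering becomes worthwhile.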