FedImpro: Measuring and Improving Client Update in Federated Learning
This work addresses data heterogeneity in federated learning with a method for improving local models. The contribution may appear incremental, since it targets the same client-drift problem as existing gradient-manipulation approaches.
The paper tackles client drift caused by heterogeneous data in federated learning. It proposes FedImpro, which decouples the model into low-level and high-level components and trains the high-level component on reconstructed feature distributions, improving generalization and reducing gradient dissimilarity.
Federated Learning (FL) models often experience client drift caused by heterogeneous data, where the distribution of data differs across clients. To address this issue, prior research primarily focuses on manipulating the existing gradients to achieve more consistent client models. In this paper, we present an alternative perspective on client drift and aim to mitigate it by generating improved local models. First, we analyze the generalization contribution of local training and conclude that it is bounded by the conditional Wasserstein distance between the data distributions of different clients. Then, we propose FedImpro to construct similar conditional distributions for local training. Specifically, FedImpro decouples the model into high-level and low-level components and trains the high-level component on reconstructed feature distributions. This approach enhances the generalization contribution and reduces the dissimilarity of gradients in FL. Experimental results show that FedImpro can help FL defend against data heterogeneity and enhance the generalization performance of the model.
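The decoupling idea in the abstract can be illustrated with a small sketch. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: it assumes the "reconstructed feature distribution" is a per-class diagonal Gaussian fitted to low-level features, uses a single linear layer for each component, and fits the statistics on one client only (in a real FL round they would presumably be combined across clients). All function and variable names are hypothetical.

```python
import numpy as np

def estimate_class_gaussians(features, labels, num_classes):
    """Fit a diagonal Gaussian per class to low-level features (assumed form)."""
    dim = features.shape[1]
    means = np.zeros((num_classes, dim))
    stds = np.ones((num_classes, dim))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            means[c] = features[mask].mean(axis=0)
            stds[c] = features[mask].std(axis=0) + 1e-6  # keep strictly positive
    return means, stds

def sample_features(means, stds, labels, rng):
    """Draw one synthetic feature vector for each requested label."""
    noise = rng.standard_normal(means[labels].shape)
    return means[labels] + stds[labels] * noise

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def high_level_step(W, feats, labels, lr=0.1):
    """One gradient step of a linear softmax head (the high-level component)."""
    probs = softmax(feats @ W)
    probs[np.arange(len(labels)), labels] -= 1.0  # d(cross-entropy)/d(logits)
    grad = feats.T @ probs / len(labels)
    return W - lr * grad

rng = np.random.default_rng(0)
num_classes, in_dim, feat_dim = 3, 5, 4
W_low = rng.standard_normal((in_dim, feat_dim))  # low-level component (fixed here)
W_high = np.zeros((feat_dim, num_classes))       # high-level component

# Local client data and the features produced by the low-level component.
x = rng.standard_normal((60, in_dim))
y = rng.integers(0, num_classes, size=60)
local_feats = np.tanh(x @ W_low)

# Reconstruct a feature distribution and sample synthetic features from it.
means, stds = estimate_class_gaussians(local_feats, y, num_classes)
shared_labels = rng.integers(0, num_classes, size=60)
shared_feats = sample_features(means, stds, shared_labels, rng)

# Train the high-level component on local plus sampled features.
train_feats = np.concatenate([local_feats, shared_feats])
train_labels = np.concatenate([y, shared_labels])
W_high_new = high_level_step(W_high, train_feats, train_labels)
```

The point of the sketch is only the data flow: the high-level component sees features drawn from a distribution that could be made similar across clients, rather than features from one client's raw data alone.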