Toward Data Heterogeneity of Federated Learning
This work addresses data heterogeneity in federated learning, a common real-world issue, but it appears incremental as it builds on existing algorithms.
The paper studies data heterogeneity in federated learning by measuring how data skew and quantity skew affect state-of-the-art algorithms. It then proposes FedMix, a new algorithm that adjusts existing methods, and reports that client-side changes appear to be more effective than server-side ones.
Federated learning is a popular paradigm for machine learning. Ideally, it works best when all clients share a similar data distribution, but this is not always the case in the real world. Federated learning on heterogeneous data has therefore attracted growing attention from both academia and industry. In this project, we first run extensive experiments to show how data skew and quantity skew affect the performance of state-of-the-art federated learning algorithms. We then propose a new algorithm, FedMix, which adjusts existing federated learning algorithms, and we report its performance. We find that existing state-of-the-art algorithms such as FedProx and FedNova do not deliver a significant improvement across all test cases. However, our tests of the existing and new algorithms suggest that tweaking the client side is more effective than tweaking the server side.
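To make the two kinds of skew concrete, the sketch below shows one common way to simulate them when splitting a dataset across clients. It is an illustration only: the abstract does not state the partitioning protocol used in the experiments, so the Dirichlet-based scheme, the reading of "data skew" as label-distribution skew, and all names here (`dirichlet`, `split_by_proportions`, `quantity_skew`, `label_skew`, the `alpha` parameter) are assumptions. A smaller `alpha` gives a more uneven split.

```python
import random

def dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k components."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow for tiny alpha
        return [1.0 / k] * k
    return [d / total for d in draws]

def split_by_proportions(items, proportions):
    """Cut a list into consecutive chunks whose sizes follow the proportions."""
    chunks, start, cumulative = [], 0, 0.0
    for p in proportions[:-1]:
        cumulative += p
        end = max(start, min(len(items), int(cumulative * len(items))))
        chunks.append(items[start:end])
        start = end
    chunks.append(items[start:])  # the last client takes the remainder
    return chunks

def quantity_skew(num_samples, num_clients, alpha, seed=0):
    """Assumed scheme: clients get different amounts of data, drawn at random."""
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return split_by_proportions(indices, dirichlet(alpha, num_clients, rng))

def label_skew(labels, num_clients, alpha, seed=0):
    """Assumed scheme: each class is spread unevenly over the clients."""
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]
    for cls in sorted(set(labels)):
        cls_idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(cls_idx)
        parts = split_by_proportions(cls_idx, dirichlet(alpha, num_clients, rng))
        for client, part in zip(clients, parts):
            client.extend(part)
    return clients

if __name__ == "__main__":
    toy_labels = [i % 10 for i in range(1000)]  # toy dataset with 10 classes
    print([len(p) for p in quantity_skew(1000, 5, alpha=0.5)])
    print([len(p) for p in label_skew(toy_labels, 5, alpha=0.5)])
```

Both functions return one list of sample indices per client, and every index is assigned to exactly one client, so the partitions can be fed to any federated training loop that accepts per-client index lists.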