LG DC Jun 12, 2021

Federated Learning on Non-IID Data: A Survey

Hangyu Zhu, Jinjin Xu, Shiqing Liu, Yaochu Jin

arXiv:2106.06843v140.41311 citations

Originality: Synthesis-oriented

AI Analysis

It tackles performance issues in federated learning for privacy-preserving distributed ML, but is incremental as a survey.

This survey analyzes how non-IID data degrades performance in federated learning, reviewing current methods to address this challenge and discussing their pros and cons.

Federated learning is an emerging distributed machine learning framework for privacy preservation. However, models trained in federated learning usually perform worse than those trained in the standard centralized learning mode, especially when the training data on the local devices are not independent and identically distributed (Non-IID). In this survey, we provide a detailed analysis of the influence of Non-IID data on both parametric and non-parametric machine learning models in both horizontal and vertical federated learning. In addition, current research on handling the challenges of Non-IID data in federated learning is reviewed, and both the advantages and disadvantages of these approaches are discussed. Finally, we suggest several future research directions before concluding the paper.
