LG
Nov 26, 2021

Non-IID data and Continual Learning processes in Federated Learning: A long road ahead

arXiv:2111.13394v1
120 citations
Originality Synthesis-oriented
AI Analysis

This is an incremental review that addresses data heterogeneity issues for researchers and practitioners in Federated Learning.

The paper tackles statistical data heterogeneity in Federated Learning, which can hinder convergence, by formally classifying the kinds of heterogeneity, reviewing existing strategies, and suggesting adaptations from Continual Learning.

Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private. This decentralized approach is prone to the consequences of data statistical heterogeneity, both across the different entities and over time, which may lead to a lack of convergence. To avoid such issues, different methods have been proposed in the past few years. However, data may be heterogeneous in many different ways, and current proposals do not always specify the kind of heterogeneity they are considering. In this work, we formally classify data statistical heterogeneity and review the most remarkable learning strategies that are able to cope with it. At the same time, we introduce approaches from other machine learning frameworks, such as Continual Learning, that also deal with data heterogeneity and could be easily adapted to Federated Learning settings.

