LG, Jan 1, 2024

A review on different techniques used to combat the non-IID and heterogeneous nature of data in FL

arXiv:2401.00809v17.915 citations, h-index: 1
Originality: Synthesis-oriented
AI Analysis

The paper addresses data heterogeneity in FL, a problem that matters for privacy-sensitive industries such as healthcare and finance, but its contribution is incremental because it reviews existing techniques rather than proposing new ones.

The paper reviews challenges in Federated Learning (FL) due to non-IID and heterogeneous data distributions across devices, exploring existing algorithms to address these issues without presenting new results or numbers.

Federated Learning (FL) is a machine-learning approach that enables collaborative model training across multiple decentralized edge devices holding local data samples, without exchanging those samples. Training is coordinated either by a central server or over a peer-to-peer network. FL is particularly significant in industries such as healthcare and finance, where data privacy is paramount. However, training a model in the federated setting raises several challenges, one of the most prominent being the heterogeneity of data distribution among the edge devices. The data is typically not independent and identically distributed (non-IID), which hinders model convergence. This report examines the issues arising from non-IID and heterogeneous data and explores current algorithms designed to address them.
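To make the setting in the abstract concrete, the following is a minimal sketch, not taken from the paper, of two ingredients it describes: a label-skewed (non-IID) split of data across clients, and server-side weighted averaging of client parameters in the style of federated averaging. The function names `shard_partition` and `fedavg`, the synthetic labels, and the client counts are all illustrative assumptions.

```python
import random
from collections import Counter


def shard_partition(labels, num_clients, shards_per_client, seed=0):
    """Label-skewed split (illustrative): sort sample indices by label,
    cut them into equal shards, and deal the shards to clients.
    Samples left over when the data does not divide evenly are dropped."""
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(order) // num_shards
    shards = [order[s * shard_size:(s + 1) * shard_size] for s in range(num_shards)]
    rng.shuffle(shards)
    return [
        [i for shard in shards[c * shards_per_client:(c + 1) * shards_per_client]
         for i in shard]
        for c in range(num_clients)
    ]


def fedavg(client_weights, client_sizes):
    """Server step (illustrative): average client parameter vectors,
    weighting each client by its number of local samples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


# Synthetic data (assumption): 1000 samples, 10 classes, 100 samples per class.
labels = [i % 10 for i in range(1000)]
clients = shard_partition(labels, num_clients=5, shards_per_client=2)

# Per-client label histograms: each client sees only a couple of classes.
histograms = [Counter(labels[i] for i in idx) for idx in clients]
for cid, hist in enumerate(histograms):
    print(f"client {cid}: {dict(hist)}")

# Toy aggregation of two clients' 2-dimensional parameter vectors.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3])
print(global_weights)
```

With this split each client holds only two of the ten classes, so its local updates are fitted to a very different label distribution from the global one. That mismatch is the kind of heterogeneity the reviewed algorithms try to compensate for.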

