LGDCFeb 17, 2023

Clustered Data Sharing for Non-IID Federated Learning over Wireless Networks

arXiv:2302.10747v27 citations | h-index: 73
Originality: Incremental advance
AI Analysis

This paper addresses statistical imbalance in federated learning for IoT applications. The contribution is incremental: it builds on existing FL methods and adds a new clustering approach.

The paper tackles non-IID data in federated learning, which raises communication costs and lowers accuracy. It proposes a clustered data sharing framework that improves convergence and model accuracy when communication is limited.

Federated Learning (FL) is a novel distributed machine learning approach that leverages data from Internet of Things (IoT) devices while maintaining data privacy. However, current FL algorithms struggle with non-independent and identically distributed (non-IID) data, which raises communication costs and degrades model accuracy. To address the statistical imbalances in FL, we propose a clustered data sharing framework in which cluster heads share part of their data with credible associates through device-to-device (D2D) communication. Moreover, to dilute the data skew on nodes, we formulate the joint clustering and data sharing problem on a privacy-preserving constrained graph. To tackle the strong coupling of decisions on the graph, we devise a distribution-based adaptive clustering algorithm (DACA) based on three deductive cluster-forming conditions, which ensures the maximum yield of data sharing. Experiments show that the proposed framework improves FL on non-IID datasets, with better convergence and model accuracy under a limited communication environment.
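The intuition behind data sharing from a cluster head can be illustrated with a small Python sketch. This is not the paper's DACA algorithm or its problem formulation. The toy label lists, the `share_fraction` parameter, and the use of total variation distance as a skew measure are all assumptions made here for illustration only.

```python
from collections import Counter


def label_distribution(labels, num_classes):
    """Empirical class frequencies of a list of integer labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def share_from_head(head_labels, member_labels, share_fraction):
    """Member's labels after receiving a fraction of the head's samples.

    Takes the first samples for determinism; a real system would
    presumably sample at random (assumption).
    """
    n_shared = int(len(head_labels) * share_fraction)
    return member_labels + head_labels[:n_shared]


NUM_CLASSES = 4
# Hypothetical toy data: the head holds a balanced set,
# the member holds a single class (extreme non-IID case).
head = [0, 1, 2, 3] * 25   # 100 samples, uniform over 4 classes
member = [0] * 50          # 50 samples, all class 0
global_dist = [1 / NUM_CLASSES] * NUM_CLASSES

skew_before = total_variation(
    label_distribution(member, NUM_CLASSES), global_dist
)
mixed = share_from_head(head, member, share_fraction=0.4)
skew_after = total_variation(
    label_distribution(mixed, NUM_CLASSES), global_dist
)

print(f"skew before sharing: {skew_before:.3f}")
print(f"skew after sharing:  {skew_after:.3f}")
```

In this toy setting the member's label distribution moves closer to the global one after receiving shared samples, which is the dilution effect the abstract describes. How clusters are formed and how much data is shared are the decisions the paper's algorithm addresses, and they are not modeled here.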

