LG | Aug 20, 2024

Federated Clustering: An Unsupervised Cluster-Wise Training for Decentralized Data Distributions

arXiv:2408.10664v3 | 2 citations | h-index: 13
Originality: Incremental advance
AI Analysis

The paper addresses unsupervised learning in federated settings for sensitive applications and offers a scalable solution for edge devices. Its contribution is incremental: it extends FL to clustering.

The paper tackles the problem of unsupervised federated clustering for decentralized data distributions, introducing FedCRef to identify all underlying data distributions across clients without labels, achieving up to 95% average local accuracy on public datasets.

Federated Learning (FL) enables decentralized machine learning while preserving data privacy, making it ideal for sensitive applications where data cannot be shared. While FL has been widely studied in supervised contexts, its application to unsupervised learning remains underdeveloped. This work introduces FedCRef, a novel unsupervised federated learning method designed to uncover all underlying data distributions across decentralized clients without requiring labels. This task, known as Federated Clustering, presents challenges due to heterogeneous, non-uniform data distributions and the lack of centralized coordination. Unlike previous methods that assume a one-cluster-per-client setup or require prior knowledge of the number of clusters, FedCRef generalizes to multi-cluster-per-client scenarios. Clients iteratively refine their data partitions while discovering all distinct distributions in the system. The process combines local clustering, model exchange and evaluation via reconstruction error analysis, and collaborative refinement within federated groups of similar distributions to enhance clustering accuracy. Extensive evaluations on four public datasets (EMNIST, KMNIST, Fashion-MNIST and KMNIST49) show that FedCRef successfully identifies true global data distributions, achieving an average local accuracy of up to 95%. The method is also robust to noisy conditions, scalable, and lightweight, making it suitable for resource-constrained edge devices.
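The "model exchange and evaluation via reconstruction error analysis" step can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each cluster is summarized by a reconstruction model, and it swaps in a rank-1 linear (PCA-style) reconstructor where the paper's method would use whatever model the clients actually train. The class `LinearReconstructor`, the function `assign_by_reconstruction_error`, and the synthetic two-client data are all hypothetical names and values chosen for illustration.

```python
import numpy as np


class LinearReconstructor:
    """Hypothetical stand-in for a per-cluster model: a rank-k linear (PCA) reconstructor."""

    def __init__(self, n_components=1):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of Vt are the principal directions of the centered data.
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[: self.n_components]
        return self

    def reconstruction_error(self, X):
        # Project onto the learned subspace and measure what is lost, per sample.
        Xc = X - self.mean_
        X_hat = Xc @ self.components_.T @ self.components_
        return np.mean((Xc - X_hat) ** 2, axis=1)


def assign_by_reconstruction_error(X, models):
    """Assign each sample to the model that reconstructs it with the lowest error."""
    errors = np.stack([m.reconstruction_error(X) for m in models], axis=1)
    return errors.argmin(axis=1), errors


# Synthetic data (assumption): two clients, each holding one distinct distribution.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(100, 1)) * np.array([1.0, 0.0]) + 0.01 * rng.normal(size=(100, 2))
X_b = (
    np.array([10.0, 10.0])
    + rng.normal(size=(100, 1)) * np.array([0.0, 1.0])
    + 0.01 * rng.normal(size=(100, 2))
)

# Each client fits a model on its local cluster, then the models are exchanged.
models = [LinearReconstructor().fit(X_a), LinearReconstructor().fit(X_b)]

# Each client scores its own data against every received model.
labels_a, errors_a = assign_by_reconstruction_error(X_a, models)
labels_b, errors_b = assign_by_reconstruction_error(X_b, models)
```

In this toy setup, a low reconstruction error under a received model suggests that the local data and the sender's data come from a similar distribution, which is the kind of signal a client could use to decide which federated group to join for collaborative refinement.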

