LG, DC | Dec 29, 2024

Asynchronous Federated Clustering with Unknown Number of Clusters

arXiv:2412.20341v1 | 14 citations | h-index: 12 | AAAI
Originality: Incremental advance
AI Analysis

This paper addresses privacy-preserving knowledge mining from non-IID data in federated learning. The advance is incremental because it builds on existing federated clustering methods.

The paper tackles two coupled problems in federated clustering, asynchronous client communication and an unknown number of clusters, and proposes AFCL, which uses seed points and a balancing mechanism to cluster effectively under both conditions.

Federated Clustering (FC) is crucial to mining knowledge from unlabeled non-Independent and Identically Distributed (non-IID) data provided by multiple clients while preserving their privacy. Most existing approaches learn cluster distributions at local clients and then securely pass the desensitized information to the server for aggregation. However, some tricky but common FC problems remain relatively unexplored, including heterogeneity in clients' communication capacity and the unknown number of proper clusters $k^*$. To further bridge the gap between FC and real application scenarios, this paper first shows that the clients' communication asynchrony and the unknown $k^*$ are complexly coupled problems, and then proposes an Asynchronous Federated Cluster Learning (AFCL) method accordingly. It spreads an excessive number of seed points to the clients as a learning medium and coordinates them across the clients to form a consensus. To alleviate the distribution imbalance accumulated through unforeseen asynchronous uploading from the heterogeneous clients, we also design a balancing mechanism for seed updating. As a result, the seeds gradually adapt to each other to reveal a proper number of clusters. Extensive experiments demonstrate the efficacy of AFCL.
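
The general idea in the abstract (start with more seed points than clusters, let clients upload at different rates, and let redundant seeds disappear) can be illustrated with a toy sketch. This is not the AFCL algorithm from the paper: the function names, the `1 / n_uploads` down-weighting used as a stand-in for the balancing mechanism, the fixed merge `radius`, the upload `schedule`, and the data are all assumptions made for illustration.

```python
import math

def nearest(x, seeds):
    """Return the index of the seed closest to point x."""
    return min(range(len(seeds)), key=lambda j: math.dist(x, seeds[j]))

def local_update(data, seeds):
    """Client side: per-seed coordinate sums and counts of the local points."""
    dim = len(seeds[0])
    sums = [[0.0] * dim for _ in seeds]
    counts = [0] * len(seeds)
    for x in data:
        j = nearest(x, seeds)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += x[d]
    return sums, counts

def aggregate(seeds, uploads, n_uploads):
    """Server side: move each seed to the weighted mean of its assigned points.

    Each client's upload is scaled by 1 / n_uploads[client] (an assumed
    balancing rule, not the paper's) so frequent uploaders do not dominate.
    Seeds that attracted no points in this round are dropped.
    """
    dim = len(seeds[0])
    new_seeds = []
    for j in range(len(seeds)):
        total = [0.0] * dim
        weight = 0.0
        for cid, (sums, counts) in uploads.items():
            w = 1.0 / n_uploads[cid]
            weight += w * counts[j]
            for d in range(dim):
                total[d] += w * sums[j][d]
        if weight > 0:
            new_seeds.append([t / weight for t in total])
    return new_seeds

def merge_close(seeds, radius):
    """Keep one representative from each group of seeds closer than radius."""
    kept = []
    for s in seeds:
        if all(math.dist(s, m) >= radius for m in kept):
            kept.append(list(s))
    return kept

# Toy data: two well-separated groups spread unevenly over three clients.
clients = {
    "a": [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]],
    "b": [[2.0, 0.0], [2.0, 2.0], [10.0, 12.0]],
    "c": [[12.0, 10.0], [12.0, 12.0]],
}
# Deliberately more seeds (6) than true clusters (2).
seeds = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0], [9.0, 9.0], [12.0, 12.0], [20.0, 20.0]]
n_uploads = {cid: 0 for cid in clients}
# Asynchrony: client "a" uploads every round, the others only sometimes.
schedule = [["a", "b"], ["a", "c"], ["a", "b", "c"]]

for round_clients in schedule:
    uploads = {}
    for cid in round_clients:
        n_uploads[cid] += 1
        uploads[cid] = local_update(clients[cid], seeds)
    seeds = merge_close(aggregate(seeds, uploads, n_uploads), radius=3.0)

k_estimate = len(seeds)
print(k_estimate, seeds)
```

In this toy run the six initial seeds shrink to two, one per group. The paper's method lets seeds adapt to each other rather than relying on a hand-picked radius, so the sketch only conveys the intuition, not the mechanism.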

Code Implementations: 1 repo