LG | Oct 23, 2023

Dynamically Weighted Federated k-Means

arXiv:2310.14858v24 citations | h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses privacy-preserving collaborative clustering across multiple data owners. The contribution is incremental: it combines two established techniques, federated learning and k-means.

The paper tackles the problem of federated clustering with heterogeneous data by introducing Dynamically Weighted Federated k-means, which matches centralized k-means performance and outperforms existing federated methods like k-FED in realistic scenarios.

Federated clustering, an integral aspect of federated machine learning, enables multiple data sources to collaboratively cluster their data, maintaining decentralization and preserving privacy. In this paper, we introduce a novel federated clustering algorithm named Dynamically Weighted Federated k-means (DWF k-means) based on Lloyd's method for k-means clustering, to address the challenges associated with distributed data sources and heterogeneous data. Our proposed algorithm combines the benefits of traditional clustering techniques with the privacy and scalability benefits offered by federated learning. The algorithm facilitates collaborative clustering among multiple data owners, allowing them to cluster their local data collectively while exchanging minimal information with the central coordinator. The algorithm optimizes the clustering process by adaptively aggregating cluster assignments and centroids from each data source, thereby learning a global clustering solution that reflects the collective knowledge of the entire federated network. We address the issue of empty clusters, which commonly arises in the context of federated clustering. We conduct experiments on multiple datasets and data distribution settings to evaluate the performance of our algorithm in terms of clustering score, accuracy, and v-measure. The results demonstrate that our approach can match the performance of the centralized classical k-means baseline, and outperform existing federated clustering methods like k-FED in realistic scenarios.
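The aggregation loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the dynamic weights are computed, so this sketch assumes each client's centroid is weighted by the number of local points assigned to that cluster, and it assumes an empty cluster simply keeps its previous global centroid. The function names (`local_update`, `aggregate`, `federated_kmeans`) are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def local_update(X, centroids):
    """One local Lloyd step on a client: assign points, then recompute means."""
    # Squared distances from every point to every centroid, shape (n, k).
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    k = centroids.shape[0]
    counts = np.bincount(labels, minlength=k)
    local = centroids.copy()  # clusters with no local points keep the global centroid
    for j in range(k):
        if counts[j] > 0:
            local[j] = X[labels == j].mean(axis=0)
    return local, counts

def aggregate(global_centroids, client_centroids, client_counts):
    """Per-cluster weighted average; weights are local counts (an assumption)."""
    C = np.stack(client_centroids)              # shape (m, k, d)
    W = np.stack(client_counts).astype(float)   # shape (m, k)
    totals = W.sum(axis=0)                      # points per cluster over all clients
    weighted = (W[:, :, None] * C).sum(axis=0)  # shape (k, d)
    new = global_centroids.copy()
    nonempty = totals > 0
    # Assumed empty-cluster rule: leave the previous global centroid in place.
    new[nonempty] = weighted[nonempty] / totals[nonempty, None]
    return new

def federated_kmeans(clients, init_centroids, rounds=10):
    """Coordinator loop: broadcast centroids, collect updates, aggregate."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(rounds):
        updates = [local_update(X, centroids) for X in clients]
        centroids = aggregate(
            centroids, [u[0] for u in updates], [u[1] for u in updates]
        )
    return centroids
```

Only centroids and per-cluster counts leave each client in this sketch; raw data points stay local. With count weights and a single local step per round, each round reproduces a centralized Lloyd update on the pooled data, which is consistent with the abstract's claim that the federated method can match the centralized baseline. The paper's actual weighting and empty-cluster handling may differ.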

Code Implementations: 1 repo
