LG CR IT | May 31, 2022

Secure Federated Clustering

Songze Li, Sizai Hou, Baturalp Buyukates, Salman Avestimehr

arXiv:2205.15564v1 | 13.015 citations | h-index: 53

Originality: Highly original

AI Analysis

This work addresses the need for secure and efficient unsupervised learning in distributed systems by offering a novel privacy-preserving clustering method for federated environments.

The paper tackles $k$-means clustering in a federated learning setting under data-privacy constraints. It develops SecFC, which incurs no performance loss compared with centralized clustering and keeps client data and cluster centers from leaking to other clients or the server. Experimental results on synthetic and real datasets show universally superior performance across different data distributions, as well as computational practicality.
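For reference, the centralized baseline that the summary compares against can be sketched as plain Lloyd's $k$-means. This is a minimal, non-private illustration and not the SecFC protocol; the function name `lloyd_kmeans`, the toy points, and the initial centers are all assumptions made for the example.

```python
def lloyd_kmeans(points, init_centers, n_iters=10):
    """Plain (non-private) Lloyd's heuristic on a list of equal-length tuples."""
    centers = [tuple(c) for c in init_centers]
    assignment = []
    for _ in range(n_iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assignment = [
            min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, assignment


# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, assignment = lloyd_kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centers, assignment)
```

Both steps only need squared distances and averages, which are low-degree polynomials of the data. That is the kind of structure a coded scheme can evaluate without seeing the raw points.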

We consider the foundational unsupervised learning task of $k$-means data clustering, in a federated learning (FL) setting consisting of a central server and many distributed clients. We develop SecFC, a secure federated clustering algorithm that simultaneously achieves 1) universal performance: no performance loss compared with clustering over centralized data, regardless of data distribution across clients; 2) data privacy: each client's private data and the cluster centers are not leaked to other clients or the server. In SecFC, the clients perform Lagrange encoding on their local data and share the coded data in an information-theoretically private manner; then, leveraging the algebraic structure of the coding, the FL network exactly executes Lloyd's $k$-means heuristic over the coded data to obtain the final clustering. Experimental results on synthetic and real datasets demonstrate the universally superior performance of SecFC for different data distributions across clients, and its computational practicality for various combinations of system parameters. Finally, we propose an extension of SecFC to further provide membership privacy for all data points.
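The idea of computing on Lagrange-coded data can be illustrated with a toy example. The sketch below is not the SecFC protocol itself: it assumes a one-dimensional data point and center, a single random mask per secret, and a prime field chosen for convenience. The names `lagrange_eval` and `encode`, the modulus `P`, and the choice of evaluation points are hypothetical. It shows only that a squared distance, being a polynomial, can be computed share by share and then recovered exactly by interpolation.

```python
import random

P = 2_147_483_647  # prime modulus 2^31 - 1 (assumed toy field size)


def lagrange_eval(xs, ys, z, p=P):
    """Evaluate at z the unique polynomial through the points (xs[i], ys[i]) over GF(p)."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p  # modular inverse, Python 3.8+
    return total


def encode(secrets, n_shares, t_masks, rng, p=P):
    """Lagrange-encode `secrets` together with `t_masks` uniform random masks."""
    k = len(secrets)
    betas = list(range(1, k + t_masks + 1))                    # interpolation points
    alphas = list(range(k + t_masks + 1, k + t_masks + 1 + n_shares))  # share points
    values = list(secrets) + [rng.randrange(p) for _ in range(t_masks)]
    shares = [lagrange_eval(betas, values, a, p) for a in alphas]
    return betas, alphas, shares


rng = random.Random(0)
x, c = 7, 3  # hypothetical private data point and cluster center

betas, alphas, x_shares = encode([x], n_shares=3, t_masks=1, rng=rng)
_, _, c_shares = encode([c], n_shares=3, t_masks=1, rng=rng)

# Share holder i sees only (x_shares[i], c_shares[i]) and computes locally.
local = [(xs - cs) ** 2 % P for xs, cs in zip(x_shares, c_shares)]

# (u_x(z) - u_c(z))^2 has degree 2, so three evaluations determine it;
# evaluating at betas[0] recovers (x - c)^2 without revealing x or c.
decoded = lagrange_eval(alphas, local, betas[0])
print(decoded)  # 16
```

With one random mask, each individual share is uniformly distributed over the field, so a single share holder learns nothing about `x` or `c`. Tolerating more colluding parties would need more masks and correspondingly more shares for decoding.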
