CR · Oct 3, 2020

Utility-efficient Differentially Private K-means Clustering based on Cluster Merging

arXiv:2010.01234v1 · 37 citations
Originality: Incremental advance
AI Analysis

This work targets the utility of differentially private clustering. It is an incremental advance: it builds on existing differentially private $k$-means methods and adds two specific enhancements, adaptive noise and cluster merging.

The paper proposes DP-KCCM, a differentially private $k$-means algorithm that combines adaptive noise with cluster merging. According to the abstract, the experiments show a significant utility gain only when the two are combined: merging with equal noise helps somewhat, while adaptive noise alone does not help.

Differential privacy is widely used in data analysis. State-of-the-art $k$-means clustering algorithms with differential privacy typically add an equal amount of noise to centroids for each iterative computation. In this paper, we propose a novel differentially private $k$-means clustering algorithm, DP-KCCM, that significantly improves the utility of clustering by adding adaptive noise and merging clusters. Specifically, to obtain $k$ clusters with differential privacy, the algorithm first generates $n \times k$ initial centroids, adds adaptive noise for each iteration to get $n \times k$ clusters, and finally merges these clusters into $k$ ones. We theoretically prove the differential privacy of the proposed algorithm. Surprisingly, extensive experimental results show that: 1) cluster merging with equal amounts of noise improves the utility somewhat; 2) although adding adaptive noise only does not improve the utility, combining both cluster merging and adaptive noise further improves the utility significantly.
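The three steps in the abstract (over-provision $n \times k$ centroids, run noisy iterations, merge down to $k$) can be illustrated with a minimal sketch. It is not the authors' DP-KCCM implementation: the abstract does not specify the noise schedule or the merging rule, so the increasing budget split in `budget_schedule`, the greedy closest-pair merge, the Laplace mechanism on per-cluster sums and counts, and the assumption that every feature lies in $[0, 1]$ are all illustrative choices, and every function name is hypothetical.

```python
import random


def laplace(rng, scale):
    # The difference of two Exp(1) draws is a standard Laplace variate.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def budget_schedule(epsilon, iters):
    # ASSUMPTION: a linearly increasing split, so later iterations get
    # more budget (less noise). The paper's adaptive rule may differ.
    weights = [t + 1 for t in range(iters)]
    total = sum(weights)
    return [epsilon * w / total for w in weights]


def dp_kmeans_merge(points, k, n, epsilon, iters=5, seed=0):
    """Sketch: noisy k-means on n*k centroids, then merge down to k.

    ASSUMPTION: every coordinate of every point lies in [0, 1], so one
    point changes the (sums, counts) vector by at most dim + 1 in L1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    m = n * k
    # Data-independent initialisation costs no privacy budget.
    centroids = [[rng.random() for _ in range(dim)] for _ in range(m)]
    noisy_counts = [1.0] * m

    for eps_t in budget_schedule(epsilon, iters):
        sums = [[0.0] * dim for _ in range(m)]
        counts = [0.0] * m
        for p in points:
            j = min(range(m), key=lambda c: sq_dist(p, centroids[c]))
            counts[j] += 1.0
            for a in range(dim):
                sums[j][a] += p[a]
        scale = (dim + 1) / eps_t  # Laplace scale = sensitivity / budget
        for j in range(m):
            # Floor the noisy count at 1 to avoid dividing by ~0.
            noisy_counts[j] = max(counts[j] + laplace(rng, scale), 1.0)
            centroids[j] = [
                min(max((sums[j][a] + laplace(rng, scale)) / noisy_counts[j], 0.0), 1.0)
                for a in range(dim)
            ]

    # ASSUMPTION: greedy merge of the closest pair, weighted by noisy
    # counts. It reads only noisy outputs, so it is post-processing.
    merged = list(zip(centroids, noisy_counts))
    while len(merged) > k:
        i, j = min(
            ((a, b) for a in range(len(merged)) for b in range(a + 1, len(merged))),
            key=lambda ab: sq_dist(merged[ab[0]][0], merged[ab[1]][0]),
        )
        (ci, wi), (cj, wj) = merged[i], merged[j]
        merged[i] = ([(wi * x + wj * y) / (wi + wj) for x, y in zip(ci, cj)], wi + wj)
        del merged[j]
    return [c for c, _ in merged]
```

Calling `dp_kmeans_merge(points, k=2, n=3, epsilon=1.0)` on a list of points scaled to $[0, 1]$ runs the noisy iterations on six centroids and returns two merged ones. Setting `n=1` skips the merge, and replacing `budget_schedule` with an equal split gives the equal-noise baseline the abstract compares against.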

Code Implementations: 1 repo