Differentially Private Algorithms for Clustering with Stability Assumptions
This work addresses privacy-preserving clustering for data with stability assumptions, offering incremental improvements in algorithm simplicity and utility metrics.
The paper tackles the problem of differentially private clustering for stable inputs by introducing a simpler algorithm that improves upon prior works, achieving utility in both Wasserstein distance and k-means cost, with extensions to k-median and local differential privacy.
We study the problem of differentially private clustering under input-stability assumptions. Despite the ever-growing volume of works on differential privacy in general and differentially private clustering in particular, only three works (Nissim et al. 2007, Wang et al. 2015, Huang et al. 2018) looked at the problem of privately clustering "nice" k-means instances, all three relying on the sample-and-aggregate framework and all three measuring utility in terms of Wasserstein distance between the true cluster centers and the centers returned by the private algorithm. In this work we improve upon this line of works on multiple axes. We present a far simpler algorithm for clustering stable inputs (not relying on the sample-and-aggregate framework), and analyze its utility in both the Wasserstein distance and the k-means cost. Moreover, our algorithm has straight-forward analogues for "nice" k-median instances and for the local-model of differential privacy.