FastLloyd: Federated, Accurate, Secure, and Tunable $k$-Means Clustering with Differential Privacy
This addresses the need for efficient and private clustering in distributed data environments, offering a significant performance improvement over existing methods.
The paper tackles the problem of privacy-preserving k-means clustering in federated settings by developing FastLloyd, which combines computational differential privacy and secure aggregation to achieve a five orders of magnitude speed-up over prior work while improving utility.
We study the problem of privacy-preserving $k$-means clustering in the horizontally federated setting. Existing federated approaches using secure computation suffer from substantial overheads and do not offer output privacy. At the same time, differentially private (DP) $k$-means algorithms either assume a trusted central curator or significantly degrade utility by adding noise in the local DP model. Naively combining the secure and central DP solutions results in a protocol with impractical overhead. Instead, our work provides enhancements to both the DP and secure computation components, resulting in a design that is faster, more private, and more accurate than previous work. By utilizing the computational DP model, we design a lightweight, secure aggregation-based approach that achieves five orders of magnitude speed-up over state-of-the-art related work. Furthermore, we not only maintain the utility of the state-of-the-art in the central model of DP, but we improve the utility further by designing a new DP clustering mechanism.