Plume: Differential Privacy at Scale
This work addresses scalability and usability problems that organizations face when deploying differential privacy at industrial scale. Its contribution is incremental, building on existing literature.
The paper tackles the challenge of implementing differential privacy in practical systems, addressing multiple contributions per user, unknown data domains, and scalability. The result is Plume, a system deployed at Google that processes datasets with trillions of records.
Differential privacy has become the standard for private data analysis, and an extensive literature now offers differentially private solutions to a wide variety of problems. However, translating these solutions into practical systems often requires confronting details that the literature ignores or abstracts away: users may contribute multiple records, the domain of possible records may be unknown, and the eventual system must scale to large volumes of data. Failure to carefully account for all three issues can severely impair a system's quality and usability. We present Plume, a system built to address these problems. We describe a number of sometimes subtle implementation issues and offer practical solutions that, together, make an industrial-scale system for differentially private data analysis possible. Plume is currently deployed at Google and is routinely used to process datasets with trillions of records.
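To make two of the issues named in the abstract concrete (multiple records per user and an unknown domain), here is a minimal single-machine sketch in Python. It is not Plume's implementation, which the abstract does not specify. The function names, the `(user, partition)` record shape, and the parameters `max_partitions`, `epsilon`, and `delta` are all assumptions chosen for illustration. The sketch uses textbook techniques: it caps how many partitions each user may affect, adds Laplace noise to per-partition user counts, and releases only partitions whose noisy count clears a threshold, so that partition keys are never taken from a predefined domain. It does not attempt the scalability issue, and its floating-point noise sampling is not suitable for production use.

```python
import math
import random
from collections import defaultdict


def bound_contributions(records, max_partitions, rng):
    """Keep at most max_partitions distinct partitions per user.

    records: iterable of (user, partition) pairs (hypothetical shape).
    Repeated (user, partition) pairs collapse to one contribution.
    """
    per_user = defaultdict(set)
    for user, partition in records:
        per_user[user].add(partition)

    bounded = []
    for user, partitions in per_user.items():
        chosen = sorted(partitions)  # fixed order so a seeded rng is reproducible
        rng.shuffle(chosen)          # pick which partitions to keep at random
        bounded.extend((user, p) for p in chosen[:max_partitions])
    return bounded


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_counts(records, epsilon, delta, max_partitions, rng=None):
    """Noisy count of distinct users per partition, with thresholding.

    Assumption: after bounding, one user changes at most max_partitions
    counts by 1 each, so the L1 sensitivity is max_partitions.
    """
    rng = rng or random.Random()
    counts = defaultdict(int)
    for _, partition in bound_contributions(records, max_partitions, rng):
        counts[partition] += 1

    scale = max_partitions / epsilon
    # Illustrative threshold: a partition backed by a single user is released
    # with probability at most delta / max_partitions.
    threshold = 1.0 + scale * math.log(max_partitions / (2.0 * delta))

    released = {}
    for partition, count in counts.items():
        noisy = count + laplace_noise(scale, rng)
        if noisy >= threshold:
            released[partition] = noisy
    return released
```

A call such as `private_counts(records, epsilon=1.0, delta=1e-6, max_partitions=2)` returns a dictionary of noisy counts for the partitions that passed the threshold. Partitions supported by very few users are almost always dropped, which is the price of not knowing the domain in advance.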