CR AI LG Jan 13, 2022

Privacy Amplification by Subsampling in Time Domain

Tatsuki Koga, Casey Meehan, Kamalika Chaudhuri

arXiv:2201.04762v1

Originality: Incremental advance

AI Analysis

This work addresses privacy risks in aggregate time-series data for applications like traffic monitoring, offering a method to reduce noise while maintaining privacy, though it is incremental as it builds on existing differential privacy frameworks.

The paper tackles the challenge of applying differential privacy to time-series data where individuals can influence multiple time steps, which traditionally requires excessive noise that obscures trends. It shows that subsampling or filtering in time can significantly reduce sensitivity, enabling more practical privacy mechanisms with demonstrated utility improvements on real-world and synthetic data.

Aggregate time-series data like traffic flow and site occupancy repeatedly sample statistics from a population across time. Such data can be profoundly useful for understanding trends within a given population, but also pose a significant privacy risk, potentially revealing e.g., who spends time where. Producing a private version of a time-series satisfying the standard definition of Differential Privacy (DP) is challenging due to the large influence a single participant can have on the sequence: if an individual can contribute to each time step, the amount of additive noise needed to satisfy privacy increases linearly with the number of time steps sampled. As such, if a signal spans a long duration or is oversampled, an excessive amount of noise must be added, drowning out underlying trends. However, in many applications an individual realistically cannot participate at every time step. When this is the case, we observe that the influence of a single participant (sensitivity) can be reduced by subsampling and/or filtering in time, while still meeting privacy requirements. Using a novel analysis, we show this significant reduction in sensitivity and propose a corresponding class of privacy mechanisms. We demonstrate the utility benefits of these techniques empirically with real-world and synthetic time-series data.
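The linear growth in noise described in the abstract can be illustrated with a minimal sketch. It assumes a plain Laplace mechanism and that each person changes each released count by at most 1. The function names `laplace_scale` and `release_counts`, and the fixed-stride subsampling, are hypothetical choices made for illustration. This is only the worst-case baseline accounting and is not the paper's analysis, which additionally exploits the fact that an individual cannot participate at every time step.

```python
import numpy as np


def laplace_scale(num_steps, epsilon, per_step_contribution=1.0):
    """Laplace noise scale for releasing `num_steps` counts under epsilon-DP.

    Assumption: one person can change every released count by at most
    `per_step_contribution`, so the L1 sensitivity grows linearly with
    the number of released steps.
    """
    l1_sensitivity = num_steps * per_step_contribution
    return l1_sensitivity / epsilon


def release_counts(counts, epsilon, stride=1, rng=None):
    """Release every `stride`-th count with Laplace noise.

    The noise is calibrated to the number of steps actually released,
    so a larger stride means fewer released points and less noise each.
    """
    rng = np.random.default_rng() if rng is None else rng
    kept = np.asarray(counts, dtype=float)[::stride]
    scale = laplace_scale(len(kept), epsilon)
    noisy = kept + rng.laplace(loc=0.0, scale=scale, size=kept.shape)
    return noisy, scale


# Illustrative input: 100 time steps of a constant count.
counts = np.full(100, 50.0)
_, full_scale = release_counts(counts, epsilon=1.0, stride=1)
_, sub_scale = release_counts(counts, epsilon=1.0, stride=10)
print(full_scale, sub_scale)
```

Under these assumptions, releasing every tenth step lowers the per-point noise scale by a factor of ten, at the cost of temporal resolution. The paper reports a sensitivity reduction from subsampling and filtering that goes beyond this simple counting argument.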
