MLCGCRLGATMay 5, 2023

Differentially Private Topological Data Analysis

arXiv:2305.03609v24 citations
Originality Highly original
AI Analysis

It addresses privacy concerns in topological data analysis for applications like human movement tracking, representing a novel method for a known bottleneck.

This paper tackles the challenge of performing differentially private topological data analysis by analyzing the sensitivity of persistence diagrams and proposing a mechanism using the $L^1$-distance to measure, achieving near-optimal privacy error bounds as demonstrated in simulations and real-world data.

This paper is the first to attempt differentially private (DP) topological data analysis (TDA), producing near-optimal private persistence diagrams. We analyze the sensitivity of persistence diagrams in terms of the bottleneck distance, and we show that the commonly used Čech complex has sensitivity that does not decrease as the sample size $n$ increases. This makes it challenging for the persistence diagrams of Čech complexes to be privatized. As an alternative, we show that the persistence diagram obtained by the $L^1$-distance to measure (DTM) has sensitivity $O(1/n)$. Based on the sensitivity analysis, we propose using the exponential mechanism whose utility function is defined in terms of the bottleneck distance of the $L^1$-DTM persistence diagrams. We also derive upper and lower bounds of the accuracy of our privacy mechanism; the obtained bounds indicate that the privacy error of our mechanism is near-optimal. We demonstrate the performance of our privatized persistence diagrams through simulations as well as on a real dataset tracking human movement.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes