DS CR LG

Aug 13, 2024

Faster Private Minimum Spanning Trees

arXiv:2408.06997v12.31 citations

h-index: 3

Originality: Incremental advance

AI Analysis

This addresses the need for faster private MST algorithms in applications like clustering and synthetic data generation, representing an incremental advance over prior methods.

The paper tackles the problem of releasing a minimum spanning tree under edge-weight differential privacy. Its new algorithm matches the utility of existing in-place methods while running in time O(m + n^{3/2} log n) for a fixed privacy parameter ρ, compared with O(n^3) for existing in-place algorithms on dense graphs. Experimental evaluations support improvements over previous algorithms in either utility or running time.

Motivated by applications in clustering and synthetic data generation, we consider the problem of releasing a minimum spanning tree (MST) under edge-weight differential privacy constraints where a graph topology $G=(V,E)$ with $n$ vertices and $m$ edges is public, the weight matrix $\vec{W}\in \mathbb{R}^{n \times n}$ is private, and we wish to release an approximate MST under $ρ$-zero-concentrated differential privacy. Weight matrices are considered neighboring if they differ by at most $Δ_\infty$ in each entry, i.e., we consider an $\ell_\infty$ neighboring relationship. Existing private MST algorithms either add noise to each entry in $\vec{W}$ and estimate the MST by post-processing or add noise to weights in-place during the execution of a specific MST algorithm. Using the post-processing approach with an efficient MST algorithm takes $O(n^2)$ time on dense graphs but results in an additive error on the weight of the MST of magnitude $O(n^2\log n)$. In-place algorithms give asymptotically better utility, but the running time of existing in-place algorithms is $O(n^3)$ for dense graphs. Our main result is a new differentially private MST algorithm that matches the utility of existing in-place methods while running in time $O(m + n^{3/2}\log n)$ for fixed privacy parameter $ρ$. The technical core of our algorithm is an efficient sublinear time simulation of Report-Noisy-Max that works by discretizing all edge weights to a multiple of $Δ_\infty$ and forming groups of edges with identical weights. Specifically, we present a data structure that allows us to sample a noisy minimum weight edge among at most $O(n^2)$ cut edges in $O(\sqrt{n} \log n)$ time. Experimental evaluations support our claims that our algorithm significantly improves previous algorithms either in utility or running time.
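To make the post-processing baseline mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm and not the authors' code. The abstract does not say which noise distribution the baseline uses, so Gaussian noise is an assumption here, calibrated with the standard fact that the Gaussian mechanism with $\ell_2$ sensitivity $Δ_2$ satisfies $ρ$-zCDP when $σ^2 = Δ_2^2/(2ρ)$. Under the $\ell_\infty$ neighboring relation over $m$ edges, $Δ_2 \le Δ_\infty\sqrt{m}$. The function names (`gaussian_sigma`, `kruskal`, `private_mst_postprocessing`) are hypothetical, and Kruskal's algorithm is used for brevity, so this sketch runs in $O(m \log m)$ rather than the $O(n^2)$ quoted for dense graphs.

```python
import math
import random


def gaussian_sigma(num_edges, delta_inf, rho):
    """Noise scale for rho-zCDP (assumed Gaussian mechanism).

    The l2 sensitivity of the weight vector is at most
    delta_inf * sqrt(num_edges); the Gaussian mechanism is rho-zCDP
    when sigma^2 = sensitivity^2 / (2 * rho).
    """
    return delta_inf * math.sqrt(num_edges / (2.0 * rho))


def kruskal(n, weighted_edges):
    """Return MST edges of a connected graph given (weight, u, v) triples."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for _, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree


def private_mst_postprocessing(n, edges, weights, delta_inf, rho, rng=None):
    """Perturb every edge weight once, then run a non-private MST algorithm.

    `edges` is the public topology; `weights` maps each edge to its
    private weight. Everything after the noise is added is
    post-processing and does not consume additional privacy budget.
    """
    rng = rng or random.Random()
    sigma = gaussian_sigma(len(edges), delta_inf, rho)
    noisy = [(weights[(u, v)] + rng.gauss(0.0, sigma), u, v) for (u, v) in edges]
    return kruskal(n, noisy)
```

Because every one of the $m$ weights is perturbed at a scale that grows with $\sqrt{m}$, the error of the returned tree grows quickly on dense graphs. This is the utility gap that in-place methods, which add noise only when choosing among cut edges, aim to close.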
