On the Query Complexity of Training Data Reconstruction in Private Learning
This addresses security concerns for private machine learning by quantifying reconstruction risks, offering foundational insights for privacy-preserving algorithms.
The paper tackles the problem of determining how many queries an adversary needs to reconstruct training data from private learners, providing the first known lower bounds on query complexity for various privacy settings, with results that are minimax optimal for all parameters.
We analyze the number of queries that a whitebox adversary needs to make to a private learner in order to reconstruct its training data. For $(ε, δ)$ DP learners with training data drawn from any arbitrary compact metric space, we provide the \emph{first known lower bounds on the adversary's query complexity} as a function of the learner's privacy parameters. \emph{Our results are minimax optimal for every $ε\geq 0, δ\in [0, 1]$, covering both $ε$-DP and $(0, δ)$ DP as corollaries}. Beyond this, we obtain query complexity lower bounds for $(α, ε)$ Rényi DP learners that are valid for any $α> 1, ε\geq 0$. Finally, we analyze data reconstruction attacks on locally compact metric spaces via the framework of Metric DP, a generalization of DP that accounts for the underlying metric structure of the data. In this setting, we provide the first known analysis of data reconstruction in unbounded, high dimensional spaces and obtain query complexity lower bounds that are nearly tight modulo logarithmic factors.