Locally Private Hypothesis Testing
This work addresses privacy-preserving statistical hypothesis testing in settings that require local differential privacy, and represents an incremental advance in the field.
The paper tackles differentially private hypothesis testing in the local model, analyzing both symmetric and non-symmetric mechanisms. It provides sample complexity bounds for identity and independence testing and shows that the non-symmetric mechanisms achieve better sample complexity than the symmetric ones.
We initiate the study of differentially private hypothesis testing in the local model, under both the standard (symmetric) randomized-response mechanism (Warner, 1965; Kasiviswanathan et al., 2008) and the newer (non-symmetric) mechanisms (Bassily and Smith, 2015; Bassily et al., 2017). First, we study the general framework of mapping each user's type into a signal and show that the problem of finding the maximum-likelihood distribution over the signals is feasible. Then we discuss the randomized-response mechanism and show that, in essence, it maps the null and alternative hypotheses onto new sets, which are affine translations of the original sets. We then give sample complexity bounds for identity and independence testing under randomized response. We then move to the newer non-symmetric mechanisms and show that there, too, the problem of finding the maximum-likelihood distribution is feasible. Under the mechanism of Bassily et al. (2017) we give identity and independence testers with better sample complexity than the testers in the symmetric case, and we also propose a $\chi^2$-based identity tester, which we investigate empirically.
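To make the affine-map claim for randomized response concrete, here is a minimal sketch. It assumes the common k-ary generalization of randomized response, in which a user reports their true type with probability e^ε / (e^ε + k - 1) and otherwise reports one of the other k - 1 types uniformly at random; the abstract does not spell out the exact variant analyzed, so this choice and all function names below are illustrative assumptions rather than the paper's own construction.

```python
import math
import random


def rr_keep_probability(k, epsilon):
    """Probability of reporting the true type under k-ary randomized response
    (assumed variant, see lead-in)."""
    e = math.exp(epsilon)
    return e / (e + k - 1)


def randomized_response(x, k, epsilon, rng=random):
    """Privatize one user's type x in {0, ..., k-1} into a signal in the same range."""
    if rng.random() < rr_keep_probability(k, epsilon):
        return x
    # Otherwise report one of the k-1 other types uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < x else other + 1


def signal_distribution(p, epsilon):
    """Distribution over signals induced by a type distribution p.

    Each coordinate is scale * p[j] + shift, i.e. an affine image of p, so a
    set of candidate type distributions is mapped to a scaled and shifted
    copy of itself.
    """
    k = len(p)
    e = math.exp(epsilon)
    scale = (e - 1) / (e + k - 1)
    shift = 1 / (e + k - 1)
    return [scale * pj + shift for pj in p]
```

Under this assumed variant, a tester that sees only the signals is effectively testing the image of the null hypothesis under the map p ↦ scale · p + shift. Since scale shrinks as ε decreases or k grows, distances between hypotheses contract, which gives some intuition for why sample complexity degrades in the local model.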