CRMar 4, 2021
On the privacy-utility trade-off in differentially private hierarchical text classificationDominik Wunderlich, Daniel Bernau, Francesco Aldà et al.
Hierarchical text classification consists in classifying text documents into a hierarchy of classes and sub-classes. Although artificial neural networks have proved useful to perform this task, unfortunately they can leak training data information to adversaries due to training data memorization. Using differential privacy during model training can mitigate leakage attacks against trained models, enabling the models to be shared safely at the cost of reduced model accuracy. This work investigates the privacy-utility trade-off in hierarchical text classification with differential privacy guarantees, and identifies neural network architectures that offer superior trade-offs. To this end, we use a white-box membership inference attack to empirically assess the information leakage of three widely used neural network architectures. We show that large differential privacy parameters already suffice to completely mitigate membership inference attacks, thus resulting only in a moderate decrease in model utility. More specifically, for large datasets with long texts we observed Transformer-based models to achieve an overall favorable privacy-utility trade-off, while for smaller datasets with shorter texts convolutional neural networks are preferable.
CROct 3, 2017
Computational Differential Privacy from Lattice-based CryptographyFilipp Valovich, Francesco Aldà
The emerging technologies for large scale data analysis raise new challenges to the security and privacy of sensitive user data. In this work we investigate the problem of private statistical analysis of time-series data in the distributed and semi-honest setting. In particular, we study some properties of Private Stream Aggregation (PSA), first introduced by Shi et al. 2017. This is a computationally secure protocol for the collection and aggregation of data in a distributed network and has a very small communication cost. In the non-adaptive query model, a secure PSA scheme can be built upon any key-homomorphic weak pseudo-random function as shown by Valovich 2017, yielding security guarantees in the standard model which is in contrast to Shi et. al. We show that every mechanism which preserves $(ε,δ)$-differential privacy in effect preserves computational $(ε,δ)$-differential privacy when it is executed through a secure PSA scheme. Furthermore, we introduce a novel perturbation mechanism based on the symmetric Skellam distribution that is suited for preserving differential privacy in the distributed setting, and find that its performances in terms of privacy and accuracy are comparable to those of previous solutions. On the other hand, we leverage its specific properties to construct a computationally efficient prospective post-quantum protocol for differentially private time-series data analysis in the distributed model. The security of this protocol is based on the hardness of a new variant of the Decisional Learning with Errors (DLWE) problem. In this variant the errors are taken from the symmetric Skellam distribution. We show that this new variant is hard based on the hardness of the standard Learning with Errors (LWE) problem where the errors are taken from the discrete Gaussian distribution. Thus, we provide a variant of the LWE problem that is hard...
LGJun 8, 2017
Pain-Free Random Differential Privacy with Sensitivity SamplingBenjamin I. P. Rubinstein, Francesco Aldà
Popular approaches to differential privacy, such as the Laplace and exponential mechanisms, calibrate randomised smoothing through global sensitivity of the target non-private function. Bounding such sensitivity is often a prohibitively complex analytic calculation. As an alternative, we propose a straightforward sampler for estimating sensitivity of non-private mechanisms. Since our sensitivity estimates hold with high probability, any mechanism that would be $(ε,δ)$-differentially private under bounded global sensitivity automatically achieves $(ε,δ,γ)$-random differential privacy (Hall et al., 2012), without any target-specific calculations required. We demonstrate on worked example learners how our usable approach adopts a naturally-relaxed privacy guarantee, while achieving more accurate releases even for non-private functions that are black-box computer programs.
CRJul 29, 2015
Private Stream Aggregation RevisitedFilipp Valovich, Francesco Aldà
In this work, we investigate the problem of private statistical analysis in the distributed and semi-honest setting. In particular, we study properties of Private Stream Aggregation schemes, first introduced by Shi et al. \cite{2}. These are computationally secure protocols for the aggregation of data in a network and have a very small communication cost. We show that such schemes can be built upon any key-homomorphic \textit{weak} pseudo-random function. Thus, in contrast to the aforementioned work, our security definition can be achieved in the \textit{standard model}. In addition, we give a computationally efficient instantiation of this protocol based on the Decisional Diffie-Hellman problem. Moreover, we show that every mechanism which preserves $(ε,δ)$-differential privacy provides \textit{computational} $(ε,δ)$-differential privacy when it is executed through a Private Stream Aggregation scheme. Finally, we introduce a novel perturbation mechanism based on the \textit{Skellam distribution} that is suited for the distributed setting, and compare its performances with those of previous solutions.
DSJul 16, 2015
The Bernstein Mechanism: Function Release under Differential PrivacyFrancesco Aldà, Benjamin I. P. Rubinstein
We address the problem of general function release under differential privacy, by developing a functional mechanism that applies under the weak assumptions of oracle access to target function evaluation and sensitivity. These conditions permit treatment of functions described explicitly or implicitly as algorithmic black boxes. We achieve this result by leveraging the iterated Bernstein operator for polynomial approximation of the target function, and polynomial coefficient perturbation. Under weak regularity conditions, we establish fast rates on utility measured by high-probability uniform approximation. We provide a lower bound on the utility achievable for any functional mechanism that is $\varepsilon$-differentially private. The generality of our mechanism is demonstrated by the analysis of a number of example learners, including naive Bayes, non-parametric estimators and regularized empirical risk minimization. Competitive rates are demonstrated for kernel density estimation; and $\varepsilon$-differential privacy is achieved for a broader class of support vector machines than known previously.