LGNov 9, 2021
DP-REC: Private & Communication-Efficient Federated LearningAleksei Triastcyn, Matthias Reisser, Christos Louizos
Privacy and communication efficiency are important challenges in federated training of neural networks, and combining them is still an open problem. In this work, we develop a method that unifies highly compressed communication and differential privacy (DP). We introduce a compression technique based on Relative Entropy Coding (REC) to the federated setting. With a minor modification to REC, we obtain a provably differentially private learning algorithm, DP-REC, and show how to compute its privacy guarantees. Our experiments demonstrate that DP-REC drastically reduces communication costs while providing privacy guarantees comparable to the state-of-the-art.
LGApr 23, 2021
Unsupervised Information Obfuscation for Split Inference of Neural NetworksMohammad Samragh, Hossein Hosseini, Aleksei Triastcyn et al.
Splitting network computations between the edge device and a server enables low edge-compute inference of neural networks but might expose sensitive information about the test query to the server. To address this problem, existing techniques train the model to minimize information leakage for a given set of sensitive attributes. In practice, however, the test queries might contain attributes that are not foreseen during training. We propose instead an unsupervised obfuscation method to discard the information irrelevant to the main task. We formulate the problem via an information theoretical framework and derive an analytical solution for a given distortion to the model output. In our method, the edge device runs the model up to a split layer determined based on its computational capacity. It then obfuscates the obtained feature vector based on the first layer of the server model by removing the components in the null space as well as the low-energy components of the remaining signal. Our experimental results show that our method outperforms existing techniques in removing the information of the irrelevant attributes and maintaining the accuracy on the target label. We also show that our method reduces the communication cost and incurs only a small computational overhead.
MANov 16, 2020
A Distributed Differentially Private Algorithm for Resource Allocation in Unboundedly Large SettingsPanayiotis Danassis, Aleksei Triastcyn, Boi Faltings
We introduce a practical and scalable algorithm (PALMA) for solving one of the fundamental problems of multi-agent systems -- finding matches and allocations -- in unboundedly large settings (e.g., resource allocation in urban environments, mobility-on-demand systems, etc.), while providing strong worst-case privacy guarantees. PALMA is decentralized, runs on-device, requires no inter-agent communication, and converges in constant time under reasonable assumptions. We evaluate PALMA in a mobility-on-demand and a paper assignment scenario, using real data in both, and demonstrate that it provides a strong level of privacy ($\varepsilon \leq 1$ and median as low as $\varepsilon = 0.5$ across agents) and high-quality matchings (up to $86\%$ of the non-private optimal, outperforming even the privacy-preserving centralized maximum-weight matching baseline).
MLMar 2, 2020
Generating Higher-Fidelity Synthetic Datasets with Privacy GuaranteesAleksei Triastcyn, Boi Faltings
This paper considers the problem of enhancing user privacy in common machine learning development tasks, such as data annotation and inspection, by substituting the real data with samples form a generative adversarial network. We propose employing Bayesian differential privacy as the means to achieve a rigorous theoretical guarantee while providing a better privacy-utility trade-off. We demonstrate experimentally that our approach produces higher-fidelity samples, compared to prior work, allowing to (1) detect more subtle data errors and biases, and (2) reduce the need for real data labelling by achieving high accuracy when training directly on artificial samples.
LGNov 22, 2019
Federated Learning with Bayesian Differential PrivacyAleksei Triastcyn, Boi Faltings
We consider the problem of reinforcing federated learning with formal privacy guarantees. We propose to employ Bayesian differential privacy, a relaxation of differential privacy for similarly distributed data, to provide sharper privacy loss bounds. We adapt the Bayesian privacy accounting method to the federated setting and suggest multiple improvements for more efficient privacy budgeting at different levels. Our experiments show significant advantage over the state-of-the-art differential privacy bounds for federated learning on image classification tasks, including a medical application, bringing the privacy budget below 1 at the client level, and below 0.1 at the instance level. Lower amounts of noise also benefit the model accuracy and reduce the number of communication rounds.
MLOct 18, 2019
Federated Generative PrivacyAleksei Triastcyn, Boi Faltings
In this paper, we propose FedGP, a framework for privacy-preserving data release in the federated learning setting. We use generative adversarial networks, generator components of which are trained by FedAvg algorithm, to draw privacy-preserving artificial data samples and empirically assess the risk of information disclosure. Our experiments show that FedGP is able to generate labelled data of high quality to successfully train and validate supervised models. Finally, we demonstrate that our approach significantly reduces vulnerability of such models to model inversion attacks.
LGJan 28, 2019
Bayesian Differential Privacy for Machine LearningAleksei Triastcyn, Boi Faltings
Traditional differential privacy is independent of the data distribution. However, this is not well-matched with the modern machine learning context, where models are trained on specific data. As a result, achieving meaningful privacy guarantees in ML often excessively reduces accuracy. We propose Bayesian differential privacy (BDP), which takes into account the data distribution to provide more practical privacy guarantees. We also derive a general privacy accounting method under BDP, building upon the well-known moments accountant. Our experiments demonstrate that in-distribution samples in classic machine learning datasets, such as MNIST and CIFAR-10, enjoy significantly stronger privacy guarantees than postulated by DP, while models maintain high classification accuracy.
LGMar 8, 2018
Generating Artificial Data for Private Deep LearningAleksei Triastcyn, Boi Faltings
In this paper, we propose generating artificial data that retain statistical properties of real data as the means of providing privacy with respect to the original dataset. We use generative adversarial network to draw privacy-preserving artificial data samples and derive an empirical method to assess the risk of information disclosure in a differential-privacy-like way. Our experiments show that we are able to generate artificial data of high quality and successfully train and validate machine learning models on this data while limiting potential privacy loss.