Hiding in the Crowd: A Massively Distributed Algorithm for Private Averaging with Malicious Adversaries
This work addresses privacy concerns in distributed machine learning for users of connected devices, offering a private averaging protocol that also protects against malicious adversaries.
The paper tackles the problem of privately computing averages over the joint data of a large set of users, for use in machine learning. It proposes a massively distributed protocol that achieves arbitrary accuracy without a third party and protects privacy against both honest-but-curious and malicious adversaries, with privacy generally improving as the network grows.
The amount of personal data collected in our everyday interactions with connected devices offers great opportunities for innovative services fueled by machine learning, but also raises serious concerns for the privacy of individuals. In this paper, we propose a massively distributed protocol for a large set of users to privately compute averages over their joint data, which can then be used to learn predictive models. Our protocol can find a solution of arbitrary accuracy, does not rely on a third party and preserves the privacy of users throughout the execution in both the honest-but-curious and malicious adversary models. Specifically, we prove that the information observed by the adversary (the set of malicious users) does not significantly reduce the uncertainty in its prediction of private values compared to its prior belief. The level of privacy protection depends on a quantity related to the Laplacian matrix of the network graph and generally improves with the size of the graph. Furthermore, we design a verification procedure which offers protection against malicious users joining the service with the goal of manipulating the outcome of the algorithm.
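To make the idea of averaging without a third party concrete, the following is a minimal sketch and not the protocol from the paper. The abstract does not spell out the mechanism, so everything here is an assumption chosen for illustration: neighbours on a ring graph add pairwise noise that cancels in the sum, and the masked values are then averaged with randomized pairwise gossip. The names `mask_with_pairwise_noise` and `gossip_average`, the ring topology, and the noise scale are all hypothetical.

```python
import random


def mask_with_pairwise_noise(values, edges, scale, seed=0):
    """Hide each value behind noise shared with neighbours (illustrative assumption).

    For every edge (i, j) a Gaussian sample is added to user i and subtracted
    from user j, so the noise cancels in the sum and the average is unchanged.
    """
    rng = random.Random(seed)
    masked = list(values)
    for i, j in edges:
        noise = rng.gauss(0.0, scale)
        masked[i] += noise
        masked[j] -= noise
    return masked


def gossip_average(values, edges, num_rounds, seed=0):
    """Randomized gossip: a random pair of neighbours averages their values.

    Each exchange preserves the sum, so on a connected graph every user's
    estimate converges to the global average without any central party.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(num_rounds):
        i, j = rng.choice(edges)
        mean = (x[i] + x[j]) / 2.0
        x[i] = mean
        x[j] = mean
    return x


# Hypothetical setup: 8 users on a ring, each holding one private number.
n = 8
edges = [(i, (i + 1) % n) for i in range(n)]
private_values = [float(v) for v in range(n)]
true_average = sum(private_values) / n

masked = mask_with_pairwise_noise(private_values, edges, scale=10.0, seed=1)
estimates = gossip_average(masked, edges, num_rounds=2000, seed=2)
```

In this sketch every entry of `estimates` ends up close to `true_average`, even though the values exchanged during gossip are the masked ones rather than the raw private values. The sketch carries no privacy guarantee and includes no verification step against malicious users.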