CR Jul 14, 2020

PrivColl: Practical Privacy-Preserving Collaborative Machine Learning

arXiv:2007.06953v1 (37 citations)
AI Analysis

This work addresses privacy in collaborative learning for participants who want to train a joint model without exposing their data or local models. It offers a practical solution with significant efficiency gains, though the contribution is incremental because it builds on existing secret sharing techniques.

The paper tackles privacy-preserving collaborative machine learning by proposing PrivColl, a framework that uses lightweight additive secret sharing to protect local data and models while maintaining training correctness. It reports speedups of more than 45X for linear/logistic regression and 216X for neural networks over state-of-the-art MPC/HE-based schemes.

Collaborative learning enables two or more participants, each with their own training dataset, to learn a joint model. The collaboration should disclose neither the raw datasets of the individual owners nor the local model parameters trained on them. This privacy-preservation requirement has been approached through differential privacy mechanisms, homomorphic encryption (HE) and secure multiparty computation (MPC), but existing approaches may either reduce model accuracy or incur significant computational and/or communication overhead. In this work, we address this problem with the lightweight additive secret sharing technique. We propose PrivColl, a framework for protecting local data and local models while ensuring the correctness of training processes. PrivColl employs a secret sharing technique to securely evaluate addition operations in a multiparty computation environment, and achieves practicality by employing only homomorphic addition operations. We formally prove that it guarantees privacy preservation even when the majority (n-2 out of n) of participants are corrupted. With experiments on real-world datasets, we further demonstrate that PrivColl retains high efficiency: it achieves a speedup of more than 45X over state-of-the-art MPC/HE-based schemes for training linear/logistic regression, and of 216X for training neural networks.
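
To make the idea of securely evaluating additions concrete, here is a minimal sketch of generic additive secret sharing in Python. It is not PrivColl's actual protocol: the names `MODULUS`, `make_shares` and `secure_sum` are hypothetical, the modulus is an arbitrary choice, and the private inputs are assumed to be non-negative integers whose true sum is smaller than the modulus (real-valued quantities would first need a fixed-point encoding). Each participant splits its value into n random shares, one per participant, and only per-participant partial sums are ever revealed.

```python
import secrets

# Assumed modulus for illustration; it only needs to exceed the true sum.
MODULUS = 2**61 - 1


def make_shares(value, n, modulus=MODULUS):
    """Split `value` into n additive shares that sum to `value` mod `modulus`."""
    # The first n-1 shares are uniformly random, so any n-1 shares on their
    # own reveal nothing about `value`.
    shares = [secrets.randbelow(modulus) for _ in range(n - 1)]
    # The last share is chosen so that all n shares add up to the value.
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_values, modulus=MODULUS):
    """Simulate n participants computing the sum of their private values."""
    n = len(private_values)
    # Row i holds the shares created by participant i;
    # entry [i][j] is the share that participant i sends to participant j.
    share_matrix = [make_shares(v, n, modulus) for v in private_values]
    # Participant j adds the shares it received and publishes only that total.
    partial_sums = [
        sum(share_matrix[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Adding the published partial sums recovers the sum of all inputs.
    return sum(partial_sums) % modulus


if __name__ == "__main__":
    print(secure_sum([12, 7, 30]))  # prints 49
```

In this sketch a single process plays every participant, so it shows only the arithmetic, not the networking or the threat model. The intuition behind tolerating n-2 corrupted participants is that the share exchanged between two honest participants stays hidden from everyone else, which keeps their individual inputs masked.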

