Xiangyun Tang

CRSep 21, 2020

Privacy-Preserving Machine Learning Training in Aggregation Scenarios

Liehuang Zhu, Xiangyun Tang, Meng Shen et al.

To develop Smart City, the growing popularity of Machine Learning (ML) that appreciates high-quality training datasets generated from diverse IoT devices raises natural questions about the privacy guarantees that can be provided in such settings. Privacy-preserving ML training in an aggregation scenario enables a model demander to securely train ML models with the sensitive IoT data gathered from personal IoT devices. Existing solutions are generally server-aided, cannot deal with the collusion threat between the servers or between the servers and data owners, and do not match the delicate environments of IoT. We propose a privacy-preserving ML training framework named Heda that consists of a library of building blocks based on partial homomorphic encryption (PHE) enabling constructing multiple privacy-preserving ML training protocols for the aggregation scenario without the assistance of untrusted servers and defending the security under collusion situations. Rigorous security analysis demonstrates the proposed protocols can protect the privacy of each participant in the honest-but-curious model and defend the security under most collusion situations. Extensive experiments validate the efficiency of Heda which achieves the privacy-preserving ML training without losing the model accuracy.

CRDec 6, 2018

When Homomorphic Cryptosystem Meets Differential Privacy: Training Machine Learning Classifier with Privacy Protection

Xiangyun Tang, Liehuang Zhu, Meng Shen et al.

Machine learning (ML) classifiers are invaluable building blocks that have been used in many fields. High quality training dataset collected from multiple data providers is essential to train accurate classifiers. However, it raises concern about data privacy due to potential leakage of sensitive information in training dataset. Existing studies have proposed many solutions to privacy-preserving training of ML classifiers, but it remains a challenging task to strike a balance among accuracy, computational efficiency, and security. In this paper, we propose Heda, an efficient privacypreserving scheme for training ML classifiers. By combining homomorphic cryptosystem (HC) with differential privacy (DP), Heda obtains the tradeoffs between efficiency and accuracy, and enables flexible switch among different tradeoffs by parameter tuning. In order to make such combination efficient and feasible, we present novel designs based on both HC and DP: A library of building blocks based on partially HC are proposed to construct complex training algorithms without introducing a trusted thirdparty or computational relaxation; A set of theoretical methods are proposed to determine appropriate privacy budget and to reduce sensitivity. Security analysis demonstrates that our solution can construct complex ML training algorithm securely. Extensive experimental results show the effectiveness and efficiency of the proposed scheme.

Xiangyun Tang

2 Papers