CR Dec 6, 2018

When Homomorphic Cryptosystem Meets Differential Privacy: Training Machine Learning Classifier with Privacy Protection

Xiangyun Tang, Liehuang Zhu, Meng Shen, Xiaojiang Du

arXiv:1812.02292v18.56 citations

Originality: Incremental advance

AI Analysis

This work addresses the privacy concerns of data providers in machine learning by offering a secure training scheme, though it appears incremental because it builds on existing methods such as homomorphic cryptosystems and differential privacy.

The paper tackles the challenge of balancing accuracy, computational efficiency, and security in privacy-preserving training of machine learning classifiers. It proposes Heda, a scheme that combines a homomorphic cryptosystem with differential privacy and allows flexible trade-offs between efficiency and accuracy through parameter tuning. The authors report experiments showing that the scheme is effective and efficient.

Machine learning (ML) classifiers are invaluable building blocks that have been used in many fields. A high-quality training dataset collected from multiple data providers is essential to train accurate classifiers. However, this raises concerns about data privacy due to the potential leakage of sensitive information in the training dataset. Existing studies have proposed many solutions for privacy-preserving training of ML classifiers, but it remains a challenging task to strike a balance among accuracy, computational efficiency, and security. In this paper, we propose Heda, an efficient privacy-preserving scheme for training ML classifiers. By combining a homomorphic cryptosystem (HC) with differential privacy (DP), Heda obtains tradeoffs between efficiency and accuracy, and enables flexible switching among different tradeoffs by parameter tuning. To make such a combination efficient and feasible, we present novel designs based on both HC and DP: a library of building blocks based on partially HC is proposed to construct complex training algorithms without introducing a trusted third party or computational relaxation; a set of theoretical methods is proposed to determine an appropriate privacy budget and to reduce sensitivity. Security analysis demonstrates that our solution can construct complex ML training algorithms securely. Extensive experimental results show the effectiveness and efficiency of the proposed scheme.
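To make the two primitives named in the abstract concrete, here is a minimal Python sketch. It is not Heda's protocol or its library of building blocks. It assumes a toy Paillier cryptosystem as the partially (additively) homomorphic scheme and the standard Laplace mechanism for differential privacy. The abstract does not name the specific cryptosystem or noise mechanism, so both choices are assumptions, as are all names (`encrypt`, `add_encrypted`, `private_sum`), the fixed-point factor, and the choice to add noise at each provider.

```python
import math
import random

# Toy Paillier parameters (assumption: illustration only, NOT secure key sizes).
P, Q = 2**31 - 1, 2**61 - 1          # two Mersenne primes
N = P * Q
N2 = N * N
G = N + 1                            # common generator choice for Paillier
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(p-1, q-1)
MU = pow(LAM, -1, N)                 # valid because G = N + 1 (Python 3.8+)

FIXED_POINT = 1000                   # hypothetical scaling of reals to integers


def encrypt(m: int) -> int:
    """Encrypt a (possibly negative) integer; negatives are encoded mod N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return (pow(G, m % N, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Decrypt and map the result back to the signed range (-N/2, N/2]."""
    m = ((pow(c, LAM, N2) - 1) // N) * MU % N
    return m - N if m > N // 2 else m


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2


def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)


def private_sum(values, epsilon: float, sensitivity: float) -> float:
    """Each provider perturbs, quantizes, and encrypts its value;
    the ciphertexts are summed without being decrypted individually."""
    total = encrypt(0)
    for v in values:
        noisy = v + laplace_noise(sensitivity / epsilon)
        total = add_encrypted(total, encrypt(round(noisy * FIXED_POINT)))
    return decrypt(total) / FIXED_POINT
```

A smaller `epsilon` (a tighter privacy budget) means a larger noise scale `sensitivity / epsilon` and therefore a less accurate sum. This is one simple way to see the kind of accuracy tradeoff that the abstract says can be tuned by parameters. For example, `private_sum([0.2, 0.5, 0.9], epsilon=1.0, sensitivity=1.0)` returns a noisy estimate of 1.6 that changes from run to run.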
