Privacy-Preserving Multiparty Learning For Logistic Regression
This work addresses the privacy concerns that arise when organizations or individuals share sensitive data for collaborative learning. Its contribution is incremental, since it applies existing differential privacy techniques to a multiparty logistic regression setting.
The paper tackles privacy leakage when multiple parties share data for collaborative machine learning. It proposes a framework that enables accurate logistic regression training while guaranteeing ε-differential privacy. Experiments on real datasets, including Bank Marketing and Credit Card Default prediction, are reported to show high efficiency and accuracy.
In recent years, machine learning techniques have been widely used in numerous applications, such as weather forecasting, financial data analysis, spam filtering, and medical prediction. Meanwhile, the massive amounts of data generated by multiple sources further improve the performance of machine learning tools. However, sharing data across multiple sources raises privacy issues for those sources, since sensitive information may be leaked in the process. In this paper, we propose a framework that enables multiple parties to collaboratively and accurately train a learning model over distributed datasets while guaranteeing the privacy of the data sources. Specifically, we consider a logistic regression model and propose two approaches for perturbing the objective function to preserve ε-differential privacy. The proposed solutions are tested on real datasets, including Bank Marketing and Credit Card Default prediction. Experimental results demonstrate that the proposed multiparty learning framework is highly efficient and accurate.
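The abstract does not spell out its two perturbation approaches, so the following is only a minimal sketch of the general idea of objective perturbation, not the authors' method. It follows the well-known single-party scheme of Chaudhuri, Monteleoni, and Sarwate for L2-regularized logistic regression: a random linear term is added to the training objective before it is minimized. It treats the pooled data as one dataset and assumes labels in {-1, +1} and feature vectors with norm at most 1. All names (`clip_rows`, `sample_noise`, `private_logreg`) and the plain gradient-descent solver are illustrative assumptions. The formal guarantee applies to the exact minimizer, which the solver here only approximates.

```python
import numpy as np


def clip_rows(X):
    """Scale each row so its Euclidean norm is at most 1 (assumed by the analysis)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1.0)


def sample_noise(dim, eps_prime, rng):
    """Draw b with density proportional to exp(-(eps_prime / 2) * ||b||)."""
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)          # uniform direction on the sphere
    radius = rng.gamma(shape=dim, scale=2.0 / eps_prime)
    return radius * direction


def private_logreg(X, y, epsilon, lam, rng, steps=2000):
    """Objective-perturbed logistic regression (illustrative sketch).

    Minimizes  mean(log(1 + exp(-y * Xw))) + (lam + delta)/2 * ||w||^2 + b.w / n
    with gradient descent, where b is the random perturbation.
    """
    n, d = X.shape
    c = 0.25  # upper bound on the second derivative of the logistic loss
    eps_prime = epsilon - np.log(1.0 + 2.0 * c / (n * lam) + c**2 / (n * lam) ** 2)
    delta = 0.0
    if eps_prime <= 0:
        # Budget too small for this lam: add extra regularization instead.
        delta = c / (n * (np.exp(epsilon / 4.0) - 1.0)) - lam
        eps_prime = epsilon / 2.0

    b = sample_noise(d, eps_prime, rng)
    w = np.zeros(d)
    lr = 1.0 / (0.25 + lam + delta)  # 1 / smoothness bound when row norms are <= 1
    for _ in range(steps):
        margins = np.clip(y * (X @ w), -30.0, 30.0)
        sig = 1.0 / (1.0 + np.exp(margins))          # sigmoid(-margin)
        grad_loss = -(X * (y * sig)[:, None]).mean(axis=0)
        grad = grad_loss + (lam + delta) * w + b / n
        w -= lr * grad
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = clip_rows(rng.normal(size=(500, 3)))
    y = np.where(X @ np.array([2.0, -1.0, 0.5]) >= 0, 1.0, -1.0)
    w = private_logreg(X, y, epsilon=1.0, lam=0.01, rng=rng)
    accuracy = np.mean(np.sign(X @ w) == y)
    print("weights:", w, "training accuracy:", accuracy)
```

A smaller ε yields a larger perturbation `b` (and possibly extra regularization `delta`), which is the usual privacy-versus-accuracy trade-off. How the noise would be split or coordinated among multiple parties is not stated in the abstract and is left out of this sketch.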