CR LG ML · Jun 24, 2019

The Value of Collaboration in Convex Machine Learning with Differential Privacy

arXiv:1906.09679v1 · 112 citations
Originality: Incremental advance
AI Analysis

The paper addresses the privacy-utility trade-off faced by data owners in collaborative ML, but the advance is incremental because it builds on existing differential privacy and gradient methods.

The paper tackles the problem of training machine learning models on distributed private data using differentially private gradients. It shows that the gap in fitness cost between the private and non-private models is inversely proportional to the square of the dataset size and the square of the privacy budget. The prediction is validated on financial datasets for loan interest rate regression and credit card fraud detection.
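As a rough illustration of what this scaling implies, the relation can be written as below. The symbols are assumptions introduced here, not the paper's notation: $n$ is the dataset size, $\epsilon$ the privacy budget, and $c$ an unspecified constant that absorbs everything else (number of owners, problem constants, and so on).

```latex
\[
  \Delta(n,\epsilon) \;\approx\; \frac{c}{n^{2}\,\epsilon^{2}}
  \qquad\Longrightarrow\qquad
  \frac{\Delta(2n,\epsilon)}{\Delta(n,\epsilon)}
  \;=\; \frac{\Delta(n,2\epsilon)}{\Delta(n,\epsilon)}
  \;=\; \frac{1}{4}.
\]
```

Under this reading, doubling either the dataset size or the privacy budget would cut the gap to the non-private model to about a quarter, and doubling both would cut it to about one sixteenth.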

In this paper, we apply machine learning to distributed private data owned by multiple data owners, entities with access to non-overlapping training datasets. We use noisy, differentially-private gradients to minimize the fitness cost of the machine learning model using stochastic gradient descent. We quantify the quality of the trained model, using the fitness cost, as a function of the privacy budget and the size of the distributed datasets to capture the trade-off between privacy and utility in machine learning. This way, we can predict the outcome of collaboration among privacy-aware data owners prior to executing potentially computationally-expensive machine learning algorithms. In particular, we show that the difference between the fitness of the machine learning model trained using differentially-private gradient queries and the fitness of the model trained in the absence of any privacy concerns is inversely proportional to the square of the size of the training datasets and the square of the privacy budget. We successfully validate the performance prediction against the actual performance of the proposed privacy-aware learning algorithms on financial datasets, applied to two tasks: determining interest rates of loans using regression, and detecting credit card fraud using support vector machines.
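The setup described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the Laplace noise, the per-record L1 clipping, the even split of the budget across steps, the squared loss, and all names (`dp_gradient`, `train`, `owners`, `clip`) are assumptions made here for the example.

```python
import numpy as np

def dp_gradient(w, X, y, epsilon, clip, rng):
    """Noisy average gradient of the squared loss for one data owner."""
    residual = X @ w - y
    grads = residual[:, None] * X  # per-record gradient of 0.5 * (x.w - y)^2
    # Assumed: clip each record's gradient to L1 norm <= clip to bound sensitivity.
    norms = np.linalg.norm(grads, ord=1, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    avg = grads.mean(axis=0)
    # Replacing one record moves the average by at most 2 * clip / n in L1 norm.
    scale = 2.0 * clip / (len(y) * epsilon)
    return avg + rng.laplace(0.0, scale, size=w.shape)

def train(owners, epsilon_total, steps=50, lr=0.1, clip=1.0, seed=0):
    """Gradient descent on size-weighted noisy gradients from all owners."""
    rng = np.random.default_rng(seed)
    w = np.zeros(owners[0][0].shape[1])
    eps_step = epsilon_total / steps  # assumed: basic composition over steps
    total = sum(len(y) for _, y in owners)
    for _ in range(steps):
        grad = sum(
            (len(y) / total) * dp_gradient(w, X, y, eps_step, clip, rng)
            for X, y in owners
        )
        w = w - lr * grad
    return w

# Synthetic example: two owners with non-overlapping datasets.
data_rng = np.random.default_rng(1)
true_w = np.array([1.0, -2.0])
owners = []
for n in (200, 300):
    X = data_rng.normal(size=(n, 2))
    owners.append((X, X @ true_w + 0.1 * data_rng.normal(size=n)))

w_private = train(owners, epsilon_total=10.0)
```

The noise scale in `dp_gradient` shrinks as `1 / (n * epsilon)`, so its variance shrinks as `1 / (n^2 * epsilon^2)`, which is the kind of dependence the abstract describes for the gap between the private and non-private models.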

