LG | CR | Jul 25, 2023

Accuracy Improvement in Differentially Private Logistic Regression: A Pre-training Approach

arXiv:2307.13771v3 | h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns in machine learning for applications handling sensitive data, but it is incremental as it builds on existing DP methods with a pre-training approach.

The paper tackles the accuracy degradation in differentially private logistic regression by introducing a pre-training module on public data before fine-tuning on private data, resulting in significant accuracy improvements as shown in numerical results.

Machine learning (ML) models can memorize their training datasets. As a result, training ML models on private datasets can violate individuals' privacy. Differential privacy (DP) is a rigorous privacy notion that protects the underlying training data. Yet training ML models in a DP framework usually degrades their accuracy. This paper aims to boost the accuracy of DP logistic regression (LR) via a pre-training module. Specifically, we first pre-train the LR model on a public training dataset that raises no privacy concerns. Then, we fine-tune the DP-LR model on the private dataset. In the numerical results, we show that adding a pre-training module significantly improves the accuracy of the DP-LR model.
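The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's stated method: the abstract does not say which DP mechanism is used, so the sketch swaps in full-batch gradient descent with per-example gradient clipping and Gaussian noise (in the style of DP-SGD). The synthetic data, the helper names (`train`, `clip_rows`, `make_data`), and the hyperparameters are hypothetical. No privacy accounting is done, so the noise level is not tied to any particular (ε, δ) guarantee.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def per_example_grads(w, X, y):
    """Logistic-loss gradient for each example: (sigmoid(x.w) - y) * x."""
    return (sigmoid(X @ w) - y)[:, None] * X

def clip_rows(g, clip):
    """Scale each row of g so that its L2 norm is at most `clip`."""
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    return g * np.minimum(1.0, clip / np.maximum(norms, 1e-12))

def train(w0, X, y, steps, lr, rng=None, clip=None, noise_multiplier=0.0):
    """Full-batch gradient descent; clipped and noised when `clip` is set."""
    w = w0.copy()
    n = len(y)
    for _ in range(steps):
        g = per_example_grads(w, X, y)
        if clip is None:
            # Non-private update (used only on the public data).
            grad = g.mean(axis=0)
        else:
            # Assumed private update: clip each example's gradient,
            # sum, add Gaussian noise scaled to the clipping bound.
            noise = rng.normal(0.0, noise_multiplier * clip, size=w.shape)
            grad = (clip_rows(g, clip).sum(axis=0) + noise) / n
        w -= lr * grad
    return w

def accuracy(w, X, y):
    return float(np.mean((X @ w > 0) == (y == 1)))

def make_data(rng, n, w_true):
    """Hypothetical synthetic data drawn from a logistic model."""
    X = rng.normal(size=(n, len(w_true)))
    y = (rng.random(n) < sigmoid(X @ w_true)).astype(float)
    return X, y

rng = np.random.default_rng(0)
d = 10
w_true = rng.normal(size=d)
X_pub, y_pub = make_data(rng, 2000, w_true)    # public: no privacy concern
X_priv, y_priv = make_data(rng, 500, w_true)   # private: needs DP
X_test, y_test = make_data(rng, 2000, w_true)

# Stage 1: ordinary (non-private) pre-training on the public dataset.
w_pre = train(np.zeros(d), X_pub, y_pub, steps=200, lr=0.5)

# Stage 2: noisy, clipped fine-tuning on the private dataset,
# starting from the pre-trained weights.
w_dp = train(w_pre, X_priv, y_priv, steps=50, lr=0.1,
             rng=rng, clip=1.0, noise_multiplier=1.0)

# Baseline: the same private training from a zero initialization.
w_scratch = train(np.zeros(d), X_priv, y_priv, steps=50, lr=0.1,
                  rng=rng, clip=1.0, noise_multiplier=1.0)

print("pre-train + DP fine-tune:", accuracy(w_dp, X_test, y_test))
print("DP from scratch:         ", accuracy(w_scratch, X_test, y_test))
```

In this toy setup the public and private data come from the same distribution, which is the most favorable case for pre-training; how much the warm start helps in practice depends on how closely the public data resembles the private data.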

