Differentially Private Empirical Risk Minimization with Input Perturbation
This work addresses privacy concerns in machine learning for data contributors and servers by providing a decentralized privacy mechanism, though it builds incrementally on existing differential privacy methods.
The paper tackles differentially private empirical risk minimization (ERM) by introducing an input perturbation framework in which data contributors randomize their own data before submission. The learned model then satisfies differential privacy, and the contributors also obtain local differential privacy against the server. Under a certain condition, the approach achieves an excess risk bound of O(1/n), where n is the sample size, matching the state-of-the-art bound.
We propose a novel framework for differentially private ERM: input perturbation. Existing differentially private ERM methods implicitly assume that data contributors submit their private data to a database, expecting the database to invoke a differentially private mechanism before publishing the learned model. In input perturbation, each data contributor independently randomizes her or his own data and submits the perturbed data to the database. We show that the input perturbation framework theoretically guarantees that the model learned from the randomized data satisfies differential privacy with the prescribed privacy parameters. At the same time, input perturbation guarantees local differential privacy against the server. We also show that the excess risk bound of the model learned with input perturbation is $O(1/n)$ under a certain condition, where $n$ is the sample size. This matches the excess risk bound of the state of the art.
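The data flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Gaussian noise, the noise scale `sigma`, the ridge-regression loss, and the helper names `perturb_record` and `fit_ridge`. In particular, `sigma` is not calibrated to any privacy parameters, so the sketch shows only who adds noise and when. It does not provide a differential privacy guarantee.

```python
import numpy as np


def perturb_record(x, y, sigma, rng):
    """Contributor-side step (hypothetical): add Gaussian noise to one record."""
    x_noisy = x + rng.normal(0.0, sigma, size=x.shape)
    y_noisy = y + rng.normal(0.0, sigma)
    return x_noisy, y_noisy


def fit_ridge(X, y, lam):
    """Server-side step: L2-regularized least-squares ERM in closed form."""
    n, d = X.shape
    A = X.T @ X / n + lam * np.eye(d)
    b = X.T @ y / n
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
n, d = 2000, 3
theta_true = np.array([0.5, -0.3, 0.2])

# Synthetic private data held by n contributors (one record each).
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = X @ theta_true + 0.05 * rng.normal(size=n)

# ASSUMPTION: illustrative noise scale, not derived from privacy parameters.
sigma = 0.1

# Each contributor randomizes her or his own record before submission;
# the server only ever sees the perturbed records.
submitted = [perturb_record(X[i], y[i], sigma, rng) for i in range(n)]
X_noisy = np.stack([rec[0] for rec in submitted])
y_noisy = np.array([rec[1] for rec in submitted])

# The server runs ordinary (non-private) ERM on the submitted data.
theta_noisy = fit_ridge(X_noisy, y_noisy, lam=0.01)

# Reference fit on the unperturbed data, for comparison only.
theta_clean = fit_ridge(X, y, lam=0.01)

print("fit on perturbed data:", theta_noisy)
print("fit on clean data:    ", theta_clean)
```

The design point the sketch tries to capture is that the server runs an ordinary learner and adds no noise of its own. All randomization happens on the contributor side before submission.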