Privacy-aware Gaussian Process Regression
This work addresses the privacy concerns of data owners in supervised learning. It is incremental in that it builds on existing Gaussian process methods and adds privacy enhancements to them.
The authors tackle the problem of sharing Gaussian process regression models under privacy constraints. Synthetic noise is added to the data until the model meets a specified privacy level. The optimal noise covariance is obtained by semi-definite programming, and kernel-based approaches handle continuous privacy constraints. The method is applied to satellite trajectory tracking and to a census dataset.
We propose a novel theoretical and methodological framework for Gaussian process regression subject to privacy constraints. The proposed method can be used when a data owner is unwilling to share a high-fidelity supervised learning model built from their data with the public due to privacy concerns. The key idea of the proposed method is to add synthetic noise to the data until the predictive variance of the Gaussian process model reaches a prespecified privacy level. The optimal covariance matrix of the synthetic noise is formulated in terms of semi-definite programming. We also introduce the formulation of privacy-aware solutions under continuous privacy constraints using kernel-based approaches, and study their theoretical properties. The proposed method is illustrated by considering a model that tracks the trajectories of satellites and a real application on a census dataset.
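The key idea of adding synthetic noise until the predictive variance reaches a privacy level can be illustrated with a minimal sketch. This is not the authors' semi-definite program: it assumes isotropic noise (covariance equal to a scalar times the identity), a unit-variance RBF kernel, and a privacy constraint checked at a finite set of test points. All function names, the kernel choice, and the parameters (`length_scale`, `jitter`, `privacy_level`) are hypothetical and chosen only for illustration. Under these assumptions the predictive variance increases monotonically with the noise variance, so the smallest admissible noise can be found by bisection.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Unit-variance RBF kernel between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)


def predictive_variance(x_train, x_test, noise_var, jitter=1e-8):
    """GP posterior variance at x_test when the data carry isotropic noise."""
    K = rbf_kernel(x_train, x_train)
    k_star = rbf_kernel(x_train, x_test)           # shape (n_train, n_test)
    A = K + (noise_var + jitter) * np.eye(len(x_train))
    solved = np.linalg.solve(A, k_star)            # A^{-1} k_star
    prior_var = np.ones(len(x_test))               # k(x, x) = 1 for this kernel
    return prior_var - np.sum(k_star * solved, axis=0)


def min_isotropic_noise(x_train, x_test, privacy_level,
                        upper=1e6, tol=1e-10, max_iter=200):
    """Smallest noise variance s such that the predictive variance is at
    least privacy_level at every test point (bisection on s)."""
    if privacy_level >= 1.0:
        # The posterior variance can never exceed the prior variance of 1.
        raise ValueError("privacy_level must be below the prior variance (1.0)")
    lo, hi = 0.0, upper
    if predictive_variance(x_train, x_test, lo).min() >= privacy_level:
        return 0.0                                 # already private enough
    if predictive_variance(x_train, x_test, hi).min() < privacy_level:
        raise ValueError("privacy_level not reachable within the search range")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if predictive_variance(x_train, x_test, mid).min() >= privacy_level:
            hi = mid                               # constraint met: try less noise
        else:
            lo = mid                               # constraint violated: add noise
        if hi - lo <= tol * max(1.0, hi):
            break
    return hi


# Illustrative usage with made-up inputs.
x_train = np.linspace(0.0, 5.0, 6)
x_test = np.array([2.5])
s = min_isotropic_noise(x_train, x_test, privacy_level=0.5)
print("noise variance:", s)
print("predictive variance:", predictive_variance(x_train, x_test, s))
```

Restricting the noise covariance to a scalar multiple of the identity keeps the search one-dimensional. The formulation described in the abstract instead optimizes over a full covariance matrix, which is why it leads to a semi-definite program rather than a scalar search.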