LG, CR, ML | Jan 8, 2019

Data Masking with Privacy Guarantees

arXiv:1901.02185v1
2 citations
Originality: Incremental advance
AI Analysis

The paper addresses privacy concerns in sensitive domains such as healthcare by enabling usable data release, but the advance is incremental because it builds on existing privacy methods.

The paper tackles data release with privacy by proposing a masking method that provides privacy guarantees while keeping a classifier trained on the masked data similar to one trained on the original data. Its theoretical analysis shows lower risk than input perturbation, especially when the number of training samples is large, and the method is illustrated on 12 benchmark datasets.

We study the problem of data release with privacy, where data is made available with privacy guarantees while keeping the usability of the data as high as possible; this is important in healthcare and other domains with sensitive data. In particular, we propose a method of masking the private data with a privacy guarantee while ensuring that a classifier trained on the masked data is similar to the classifier trained on the original data, to maintain usability. We analyze the theoretical risks of the proposed method and of the traditional input perturbation method. Results show that the proposed method achieves lower risk than input perturbation, especially when the number of training samples gets large. We illustrate the effectiveness of the proposed method of data masking for privacy-sensitive learning on $12$ benchmark datasets.
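For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of input perturbation, not the paper's masking method. It assumes the classical Gaussian mechanism with per-record L2 clipping. The function names (`gaussian_sigma`, `perturb_inputs`), the `clip_norm` parameter, and the sensitivity bound of `2 * clip_norm` are illustrative assumptions and do not come from the paper.

```python
import math
import numpy as np


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism.

    Uses sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon,
    which is the standard calibration for 0 < epsilon < 1.
    """
    if not (0.0 < epsilon < 1.0):
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must lie in (0, 1)")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def perturb_inputs(X, epsilon, delta, clip_norm=1.0, seed=None):
    """Release a noisy copy of the feature matrix X (one record per row).

    Assumption: each row is clipped to L2 norm <= clip_norm, so replacing
    one record changes that row by at most 2 * clip_norm in L2 norm.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Scale down any row whose norm exceeds clip_norm; leave the rest as-is.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    X_clipped = X * scale

    # Add independent Gaussian noise to every entry.
    sigma = gaussian_sigma(2.0 * clip_norm, epsilon, delta)
    return X_clipped + rng.normal(0.0, sigma, size=X.shape)
```

A classifier would then be trained on the output of `perturb_inputs` instead of on `X`. Because the noise scale in this sketch does not shrink as the number of records grows, it gives some intuition for why the abstract singles out large training sets as the regime where input perturbation compares poorly.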
