CR · LG · Jul 24, 2017

Desensitized RDCA Subspaces for Compressive Privacy in Machine Learning

arXiv:1707.07770v1 · 2 citations
Originality: Incremental advance
AI Analysis

It addresses privacy protection in machine learning for classification, offering a promising but incremental solution.

The paper tackles privacy in machine learning classification by using Ridge Discriminant Component Analysis (RDCA) to desensitize data, achieving near-random-guess accuracy on privacy labels with small utility drops, averaging 5.14% on the HAR dataset and 0.04% on the CMU Faces dataset.

The quest for better data analysis and artificial intelligence has led to more and more data being collected and stored. As a consequence, more data are exposed to malicious entities. This paper examines the problem of privacy in machine learning for classification. We utilize Ridge Discriminant Component Analysis (RDCA) to desensitize data with respect to a privacy label. Based on five experiments, we show that desensitization by RDCA can effectively protect privacy (i.e., low accuracy on the privacy label) with a small loss in utility. On the HAR and CMU Faces datasets, the use of desensitized data results in random-guess-level accuracies for privacy, at a cost of average drops of 5.14% and 0.04%, respectively, in the utility accuracies. For the Semeion Handwritten Digit dataset, accuracies on the privacy-sensitive digits are almost zero, while the accuracies for the utility-relevant digits drop by 7.53% on average. This presents a promising solution to the problem of privacy in machine learning for classification.
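The abstract does not spell out the desensitization procedure, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the usual discriminant-component formulation: a between-class scatter matrix built from the privacy label, a ridge-regularized total scatter matrix, and a generalized eigenproblem whose leading (number of classes minus one) eigenvectors carry the privacy-discriminative directions. Desensitization is then taken to mean projecting the data onto the remaining eigenvectors. The function name `rdca_desensitize`, the parameter `rho`, and the choice to return the reduced-dimension projection are all hypothetical.

```python
import numpy as np

def rdca_desensitize(X, privacy_labels, rho=1e-3):
    """Project X onto directions that carry no privacy-class mean information.

    X: array of shape (n_samples, n_features).
    privacy_labels: array of shape (n_samples,).
    rho: ridge parameter (assumed value, not taken from the paper).
    Returns an array of shape (n_samples, n_features - (n_classes - 1)).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(privacy_labels)
    Xc = X - X.mean(axis=0)          # center the data
    d = X.shape[1]
    classes = np.unique(y)

    # Between-class scatter with respect to the privacy label.
    S_b = np.zeros((d, d))
    for c in classes:
        Xk = Xc[y == c]
        mk = Xk.mean(axis=0)
        S_b += len(Xk) * np.outer(mk, mk)

    # Ridge-regularized total scatter matrix.
    A = Xc.T @ Xc + rho * np.eye(d)

    # Solve S_b w = lambda * A w by whitening with the Cholesky factor of A.
    L = np.linalg.cholesky(A)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    eigvals, V = np.linalg.eigh(M)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    W = L_inv.T @ V[:, order]        # generalized eigenvectors, W.T @ A @ W = I

    # Drop the leading (n_classes - 1) privacy-discriminative components.
    k = len(classes) - 1
    W_rest = W[:, k:]
    return Xc @ W_rest
```

In this sketch, every retained direction has a (numerically) zero generalized eigenvalue, so the per-class means of the privacy label coincide in the projected data. A classifier for the utility label would then be trained on the returned array; how much utility survives depends on the dataset, as the reported accuracy drops indicate.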

