LG AI CR · Jun 8, 2020

Privacy Adversarial Network: Representation Learning for Mobile Data Privacy

Sicong Liu, Junzhao Du, Anshumali Shrivastava, Lin Zhong

arXiv:2006.06535v114.016 citations

Originality: Highly original

AI Analysis

This work addresses privacy concerns for mobile users of machine learning services. Instead of obfuscating data or sending hand-extracted features, as prior approaches do, it learns the representation adversarially.

The paper tackles the problem of balancing service utility and data privacy in cloud-based mobile services. It proposes a privacy adversarial network (PAN) that uses adversarial learning to automatically learn representations from raw data. Experiments on six datasets are reported to show better utility and better privacy at the same time.

The remarkable success of machine learning has fostered a growing number of cloud-based intelligent services for mobile users. Such a service requires a user to send data, e.g., images, voice, and video, to the provider, which presents a serious challenge to user privacy. To address this, prior works either obfuscate the data, e.g., by adding noise or removing identity information, or send representations extracted from the data, e.g., anonymized features. They struggle to balance service utility and data privacy because obfuscated data reduces utility and extracted representations may still reveal sensitive information. This work departs from prior works in methodology: we leverage adversarial learning to achieve a better balance between privacy and utility. We design a representation encoder that generates feature representations optimized against the privacy disclosure risk of sensitive information (a measure of privacy), as estimated by the privacy adversaries, and concurrently optimized for task inference accuracy (a measure of utility), as estimated by the utility discriminator. The result is the privacy adversarial network (PAN), a novel deep model with a new training algorithm that can automatically learn representations from the raw data. Intuitively, PAN adversarially forces the extracted representations to convey only the information required by the target task. Surprisingly, this constitutes an implicit regularization that actually improves task accuracy. As a result, PAN achieves better utility and better privacy at the same time. We report extensive experiments on six popular datasets and demonstrate the superiority of PAN compared with alternative methods reported in prior work.
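
The trade-off described in the abstract can be illustrated with a toy objective. This is only a sketch under stated assumptions, not the paper's actual loss or training algorithm: it assumes a single privacy adversary that tries to classify a private attribute, it uses cross-entropy for both terms, and the names `cross_entropy`, `encoder_objective`, and the weight `lam` are hypothetical. The encoder is rewarded when the utility discriminator predicts the task label well and when the adversary predicts the private label poorly.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under a probability vector."""
    return -math.log(max(probs[label], 1e-12))


def encoder_objective(utility_probs, task_label,
                      adversary_probs, private_label, lam=1.0):
    """Toy loss an encoder might minimize (assumed form, not from the paper).

    utility_probs:   utility discriminator's predicted task distribution
    adversary_probs: privacy adversary's predicted private-attribute distribution
    lam:             hypothetical weight trading privacy against utility
    """
    utility_loss = cross_entropy(utility_probs, task_label)
    adversary_loss = cross_entropy(adversary_probs, private_label)
    # Low utility loss is good for the encoder; high adversary loss is also good,
    # so the adversary term enters with a negative sign.
    return utility_loss - lam * adversary_loss


# Same task accuracy, but two different levels of privacy leakage.
leaky = encoder_objective([0.9, 0.1], 0, [0.95, 0.05], 0)
private = encoder_objective([0.9, 0.1], 0, [0.5, 0.5], 0)
print(private < leaky)
```

In this sketch, a representation from which the adversary can only guess at chance scores a lower (better) encoder loss than one that leaks the private attribute, even though task accuracy is identical. In adversarial training of this general kind, the adversary would be updated in alternation to minimize its own loss while the encoder minimizes the combined one.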
