LG · CR · Dec 1, 2022

Purifier: Defending Data Inference Attacks via Transforming Confidence Scores

Ziqi Yang, Lijin Wang, Da Yang, Jie Wan, Ziming Zhao, Ee-Chien Chang, Fan Zhang, Kui Ren

arXiv:2212.00612v114.120 citations · h-index: 54

Originality: Incremental advance

AI Analysis

This paper addresses security vulnerabilities in neural networks that matter to users concerned about privacy and data leakage. It represents an incremental improvement over previous defense methods.

The paper tackles the problem of defending neural networks against data inference attacks. It proposes PURIFIER, a method that transforms confidence scores so that they are indistinguishable between members and non-members. The reported result is a highly effective defense against membership inference attacks with negligible utility loss. PURIFIER is also reported to be effective against adversarial model inversion and attribute inference attacks; for example, it raises the inversion error by a factor of about 4 or more on the Facescrub530 classifier.

Neural networks are susceptible to data inference attacks such as the membership inference attack, the adversarial model inversion attack and the attribute inference attack, where the attacker can infer useful information such as the membership, the reconstruction or the sensitive attributes of a data sample from the confidence scores predicted by the target classifier. In this paper, we propose a method, namely PURIFIER, to defend against membership inference attacks. It transforms the confidence score vectors predicted by the target classifier and makes the purified confidence scores indistinguishable in individual shape, statistical distribution and prediction label between members and non-members. The experimental results show that PURIFIER defends against membership inference attacks with high effectiveness and efficiency, outperforming previous defense methods, while incurring negligible utility loss. Our further experiments show that PURIFIER is also effective in defending against adversarial model inversion attacks and attribute inference attacks. For example, the inversion error is raised about 4+ times on the Facescrub530 classifier, and the attribute inference accuracy drops significantly when PURIFIER is deployed in our experiment.
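
To make the idea of "indistinguishable confidence scores" concrete, here is a minimal toy sketch. It is not the PURIFIER method from the paper: `toy_purify`, the synthetic score ranges, and the 0.92 attack threshold are all assumptions chosen for illustration. It only shows why a confidence-threshold membership attack stops working once every output vector has the same shape, and it addresses only the "individual shape" aspect mentioned in the abstract.

```python
import random

def peaked_vector(peak, n_classes, label):
    """Confidence vector with `peak` mass on `label`, the rest spread evenly."""
    rest = (1.0 - peak) / (n_classes - 1)
    return [peak if i == label else rest for i in range(n_classes)]

def argmax(scores):
    """Index of the largest confidence score (the predicted label)."""
    return max(range(len(scores)), key=scores.__getitem__)

def toy_purify(scores, peak=0.9):
    """Hypothetical stand-in for a purifier (NOT the paper's algorithm):
    keep the predicted label, replace the vector's shape with a fixed one."""
    return peaked_vector(peak, len(scores), argmax(scores))

def threshold_attack_accuracy(member_scores, nonmember_scores, threshold):
    """Attacker guesses 'member' when the top confidence exceeds `threshold`."""
    correct = sum(max(s) > threshold for s in member_scores)
    correct += sum(max(s) <= threshold for s in nonmember_scores)
    return correct / (len(member_scores) + len(nonmember_scores))

rng = random.Random(0)
n_classes = 10

# Synthetic assumption: an overfitted classifier is more confident on
# training members than on non-members.
members = [
    peaked_vector(rng.uniform(0.95, 0.999), n_classes, rng.randrange(n_classes))
    for _ in range(500)
]
nonmembers = [
    peaked_vector(rng.uniform(0.5, 0.9), n_classes, rng.randrange(n_classes))
    for _ in range(500)
]

raw_acc = threshold_attack_accuracy(members, nonmembers, threshold=0.92)
purified_acc = threshold_attack_accuracy(
    [toy_purify(s) for s in members],
    [toy_purify(s) for s in nonmembers],
    threshold=0.92,
)

print(f"attack accuracy on raw scores:      {raw_acc:.2f}")
print(f"attack accuracy on purified scores: {purified_acc:.2f}")
```

On this synthetic data the attack separates members from non-members perfectly on the raw scores and falls to chance (0.5) on the transformed ones, while the predicted label of every sample is unchanged. A real defense has to achieve this on actual model outputs and, as the abstract notes, must also account for the statistical distribution and the prediction label, which this sketch does not attempt.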
