AIJul 17, 2023

Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model

arXiv:2307.08424v331 citationsh-index: 22
Originality Highly original
AI Analysis

This addresses privacy threats for machine learning models in practical black-box settings, representing a novel advancement in attack methods.

The paper tackles the problem of model inversion attacks in label-only scenarios, where attackers only have access to output labels, by developing a method using a conditional diffusion model to recover private training data, achieving superior performance in generating accurate samples compared to previous approaches.

Model inversion attacks (MIAs) aim to recover private data from inaccessible training sets of deep learning models, posing a privacy threat. MIAs primarily focus on the white-box scenario where attackers have full access to the model's structure and parameters. However, practical applications are usually in black-box scenarios or label-only scenarios, i.e., the attackers can only obtain the output confidence vectors or labels by accessing the model. Therefore, the attack models in existing MIAs are difficult to effectively train with the knowledge of the target model, resulting in sub-optimal attacks. To the best of our knowledge, we pioneer the research of a powerful and practical attack model in the label-only scenario. In this paper, we develop a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label from the training set. Two techniques are introduced: selecting an auxiliary dataset relevant to the target model task and using predicted labels as conditions to guide training CDM; and inputting target label, pre-defined guidance strength, and random noise into the trained attack model to generate and correct multiple results for final selection. This method is evaluated using Learned Perceptual Image Patch Similarity as a new metric and as a judgment basis for deciding the values of hyper-parameters. Experimental results show that this method can generate similar and accurate samples to the target label, outperforming generators of previous approaches.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes