CV · Oct 22, 2024

Network Inversion for Training-Like Data Reconstruction

arXiv:2410.16884v13.71 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses privacy concerns for organizations and individuals who share machine learning models: it shows that training-like data can be reconstructed from a trained model, which puts data confidentiality at risk.

The paper tackles the problem of reconstructing training-like data from trained models, presenting TLDR, a network inversion approach that exploits classifier properties and prior knowledge to generate realistic images, demonstrating privacy risks in model sharing.

Machine Learning models are often trained on proprietary and private data that cannot be shared, though the trained models themselves are distributed openly assuming that sharing model weights is privacy preserving, as training data is not expected to be inferred from the model weights. In this paper, we present Training-Like Data Reconstruction (TLDR), a network inversion-based approach to reconstruct training-like data from trained models. To begin with, we introduce a comprehensive network inversion technique that learns the input space corresponding to different classes in the classifier using a single conditioned generator. While inversion may typically return random and arbitrary input images for a given output label, we modify the inversion process to incentivize the generator to reconstruct training-like data by exploiting key properties of the classifier with respect to the training data along with some prior knowledge about the images. To validate our approach, we conduct empirical evaluations on multiple standard vision classification datasets, thereby highlighting the potential privacy risks involved in sharing machine learning models.
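The inversion idea in the abstract can be illustrated with a toy sketch. This is not the authors' TLDR implementation: it assumes a frozen 3-class linear softmax classifier over 2-D inputs in place of a trained vision model, a very simple label-conditioned generator (a per-label offset plus scaled noise), and an L2 penalty standing in for "prior knowledge about the images". All names (`classify`, `generate`, `train_generator`, `prior_weight`) are hypothetical.

```python
import math
import random

# Frozen toy classifier: 3 classes over 2-D inputs (stand-in for a trained network).
# Its weights are never updated; only the generator is trained.
W = [[2.0, 0.0], [-1.0, 1.7], [-1.0, -1.7]]


def classify(x):
    """Return softmax class probabilities for input x."""
    logits = [w[0] * x[0] + w[1] * x[1] for w in W]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(params, label, z):
    """Conditioned generator: per-label offset plus per-label scaled latent noise."""
    offset, scale = params[label]
    return [offset[i] + scale[i] * z[i] for i in range(2)]


def train_generator(steps=2000, lr=0.05, prior_weight=0.1, seed=0):
    """Train the generator so the frozen classifier assigns its outputs the conditioning label."""
    rng = random.Random(seed)
    params = {k: ([0.0, 0.0], [1.0, 1.0]) for k in range(3)}
    for _ in range(steps):
        label = rng.randrange(3)
        z = [rng.gauss(0, 1), rng.gauss(0, 1)]
        x = generate(params, label, z)
        p = classify(x)
        # Gradient of cross-entropy w.r.t. each logit: p_k - 1[k == label].
        dlogits = [p[k] - (1.0 if k == label else 0.0) for k in range(3)]
        offset, scale = params[label]
        for i in range(2):
            # Backpropagate through the frozen classifier, then add the
            # gradient of the assumed prior 0.5 * prior_weight * ||x||^2.
            gx = sum(dlogits[k] * W[k][i] for k in range(3)) + prior_weight * x[i]
            offset[i] -= lr * gx          # dx/d(offset) = 1
            scale[i] -= lr * gx * z[i]    # dx/d(scale) = z
    return params


if __name__ == "__main__":
    trained = train_generator()
    for label in range(3):
        sample = generate(trained, label, [0.0, 0.0])
        print(label, sample, classify(sample))
```

In this sketch the classification loss alone would push generated inputs arbitrarily far from the origin; the prior term is what keeps them in a bounded region, loosely mirroring the abstract's point that plain inversion returns arbitrary inputs unless extra constraints steer it toward training-like data.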
