LG, CR · Dec 11, 2024

Training Data Reconstruction: Privacy due to Uncertainty?

arXiv:2412.08544v1 · 3 citations · h-index: 33 · 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality: Incremental advance
AI Analysis

This work addresses privacy risks in machine learning for data owners, but it is incremental as it builds on prior reconstruction methods.

The paper tackles the problem of training data reconstruction from neural network parameters, a privacy concern, and finds that reconstructions can resemble valid training samples without being actual training data, making it difficult for an adversary to identify true training samples.

Being able to reconstruct training data from the parameters of a neural network is a major privacy concern. Previous works have shown that reconstructing training data is possible under certain circumstances. In this work, we analyse such reconstructions empirically and propose a new formulation of the reconstruction as a solution to a bilevel optimisation problem. We demonstrate that our formulation, as well as previous approaches, depends strongly on the initialisation of the training images $x$ to reconstruct. In particular, we show that a random initialisation of $x$ can lead to reconstructions that resemble valid training samples while not being part of the actual training dataset. Thus, our experiments on affine and one-hidden-layer networks suggest that, when reconstructing natural images, an adversary cannot identify whether reconstructed images have indeed been part of the set of training samples.
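
For orientation, a bilevel reconstruction problem of this kind can be written generically as follows. This is only a sketch under assumed notation: the symbols $\hat\theta$ (the observed trained parameters), $f_\theta$ (the network), $\ell$ (the training loss), $y_i$ (labels), $n$ (number of samples), and the squared-norm outer objective are illustrative choices, not the formulation stated in the paper.

$$
\min_{x_1,\dots,x_n} \; \big\| \theta^\star(x) - \hat\theta \big\|_2^2
\quad \text{subject to} \quad
\theta^\star(x) \in \arg\min_{\theta} \; \sum_{i=1}^{n} \ell\big(f_\theta(x_i),\, y_i\big)
$$

In this reading, the inner problem trains the network on candidate images $x$, and the outer problem adjusts $x$ until the resulting parameters match the observed ones. An outer objective of this form may admit many minimisers, so different initialisations of $x$ can end at different solutions, which is consistent with the dependence on initialisation described above.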

