Deconstructing Data Reconstruction: Multiclass, Weight Decay and General Losses
This work studies data memorization in neural networks, a question of interest to researchers in machine learning security and privacy. It is incremental in that it builds on existing reconstruction methods.
The authors extend prior work on reconstructing training data from neural network parameters to multiclass and convolutional networks, derive a more general scheme applicable to a wider range of loss functions, and find that weight decay increases reconstructability in both quantity and quality.
Memorization of training data is an active research area, yet our understanding of the inner workings of neural networks is still in its infancy. Recently, Haim et al. (2022) proposed a scheme to reconstruct training samples from multilayer perceptron binary classifiers, effectively demonstrating that a large portion of training samples are encoded in the parameters of such networks. In this work, we extend their findings in several directions, including reconstruction from multiclass and convolutional neural networks. We derive a more general reconstruction scheme which is applicable to a wider range of loss functions such as regression losses. Moreover, we study the various factors that contribute to networks' susceptibility to such reconstruction schemes. Intriguingly, we observe that using weight decay during training increases reconstructability both in terms of quantity and quality. Additionally, we examine the influence of the number of neurons relative to the number of training samples on the reconstructability. Code: https://github.com/gonbuzaglo/decoreco
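To give a feel for why trained parameters can encode training samples when weight decay is used, here is a minimal sketch on a toy problem. It is an illustration under stated assumptions, not the scheme from the paper or the linked repository: the model is a linear regressor rather than a neural network, and the names `predict`, `objective_grad`, `stationarity_residual`, and `train` are hypothetical. The sketch only checks that, at a minimum of a squared loss plus an L2 penalty, the gradient of the full objective vanishes when evaluated on the true training set, whereas an unrelated candidate set generally leaves a nonzero residual. As we understand it, reconstruction schemes of this kind search for candidate inputs that drive a residual of this type toward zero; that search is not implemented here.

```python
import random

def predict(theta, x):
    # Toy linear model (assumption): inner product of parameters and input.
    return sum(t * xi for t, xi in zip(theta, x))

def objective_grad(theta, xs, ys, weight_decay):
    # Gradient of mean 0.5*(pred - y)^2 plus 0.5*weight_decay*||theta||^2.
    n = len(xs)
    grad = [weight_decay * t for t in theta]
    for x, y in zip(xs, ys):
        err = predict(theta, x) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / n
    return grad

def stationarity_residual(theta, xs, ys, weight_decay):
    # Squared norm of the objective gradient for a candidate data set.
    return sum(g * g for g in objective_grad(theta, xs, ys, weight_decay))

def train(xs, ys, weight_decay, lr=0.1, steps=5000):
    # Plain full-batch gradient descent from zero initialization.
    theta = [0.0] * len(xs[0])
    for _ in range(steps):
        grad = objective_grad(theta, xs, ys, weight_decay)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

rng = random.Random(0)
weight_decay = 0.1
xs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
ys = [rng.gauss(0, 1) for _ in range(5)]
theta = train(xs, ys, weight_decay)

# Residual on the true training inputs versus on unrelated random inputs.
res_true = stationarity_residual(theta, xs, ys, weight_decay)
fake_xs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5)]
res_fake = stationarity_residual(theta, fake_xs, ys, weight_decay)
print("residual on training data:", res_true)
print("residual on random data:  ", res_fake)
```

A linear model is used only to keep the example short and dependency-free; the networks, losses, and optimization details studied in the paper are considerably richer.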