Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
The work addresses privacy and compliance needs of users who must remove sensitive data from trained models, but the contribution is incremental, since it builds on existing concepts from differential privacy and the stability of stochastic gradient descent.
The paper tackles the problem of selectively removing information about specific training data from the weights of a deep neural network without retraining. It proposes a method that makes any probing function of the weights indistinguishable from the same function applied to a network trained without that data, together with an efficiently computable upper bound on the information that remains.
We explore the problem of selectively forgetting a particular subset of the data used for training a deep neural network. While the effects of the data to be forgotten can be hidden from the output of the network, insights may still be gleaned by probing deep into its weights. We propose a method for "scrubbing" the weights clean of information about a particular set of training data. The method does not require retraining from scratch, nor access to the data originally used for training. Instead, the weights are modified so that any probing function of the weights is indistinguishable from the same function applied to the weights of a network trained without the data to be forgotten. This condition is a generalized and weaker form of Differential Privacy. Exploiting ideas related to the stability of stochastic gradient descent, we introduce an upper bound on the amount of information remaining in the weights, which can be estimated efficiently even for deep neural networks.
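The indistinguishability condition above can be illustrated with a toy calculation. The sketch below is not the paper's procedure. It assumes, purely for illustration, that the scrubbed weights and the weights of a network trained without the forgotten data both follow diagonal Gaussian distributions, and that the probing function is a linear readout `f(w) = a . w`. All names (`kl_diag_gauss`, `project`, `mu_scrub`, `mu_ref`, and so on) are hypothetical. By the data processing inequality, the KL divergence between the two probe distributions cannot exceed the KL divergence between the two weight distributions, which is why a bound computed on the weights covers every probe.

```python
import numpy as np


def kl_diag_gauss(mu0, var0, mu1, var1):
    """KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) ) in nats."""
    mu0, var0, mu1, var1 = map(np.asarray, (mu0, var0, mu1, var1))
    return 0.5 * np.sum(
        np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0
    )


def project(mu, var, a):
    """Mean and variance of the scalar probe a . w for w ~ N(mu, diag(var))."""
    a = np.asarray(a)
    return float(a @ mu), float((a ** 2) @ var)


rng = np.random.default_rng(0)
d = 5  # toy number of weights (assumption)

# Hypothetical distribution of scrubbed weights (trained on all data, then scrubbed).
mu_scrub, var_scrub = rng.normal(size=d), rng.uniform(0.5, 2.0, size=d)
# Hypothetical distribution of weights trained without the forgotten data.
mu_ref, var_ref = rng.normal(size=d), rng.uniform(0.5, 2.0, size=d)

# Divergence measured directly on the weights.
kl_weights = kl_diag_gauss(mu_scrub, var_scrub, mu_ref, var_ref)

# Divergence seen through one linear probe of the weights.
a = rng.normal(size=d)
m0, v0 = project(mu_scrub, var_scrub, a)
m1, v1 = project(mu_ref, var_ref, a)
kl_probe = kl_diag_gauss([m0], [v0], [m1], [v1])

print(f"KL on weights: {kl_weights:.4f}  KL through probe: {kl_probe:.4f}")
```

Driving the weight-level divergence toward zero therefore drives the divergence of every readout toward zero as well, which is the sense in which a single upper bound on the weights controls all probing functions.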