Towards Characterizing and Limiting Information Exposure in DNN Layers
The work addresses privacy risks for users of on-device DNN-based prediction services, but its contribution is incremental: it builds on the existing concept of generalization error and on existing protection methods.
The paper tackles the problem of sensitive information exposure in pre-trained DNN layers by proposing a framework to measure memorization, finding that last layers encode more information and convolutional layers expose more than fully connected ones. It evaluates an architecture to protect sensitive layers in TEEs against membership inference attacks without high computational overhead.
Pre-trained Deep Neural Network (DNN) models are increasingly used in smartphones and other user devices to enable prediction services, leading to potential disclosures of (sensitive) information from training data captured inside these models. Based on the concept of generalization error, we propose a framework to measure the amount of sensitive information memorized in each layer of a DNN. Our results show that, when considered individually, the last layers encode a larger amount of information from the training data compared to the first layers. We find that, while neurons in convolutional layers can expose more (sensitive) information than those in fully connected layers, the same DNN architecture trained with different datasets has similar exposure per layer. We evaluate an architecture to protect the most sensitive layers within the memory limits of a Trusted Execution Environment (TEE) against potential white-box membership inference attacks without significant computational overhead.
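The framework rests on generalization error, which is commonly estimated as the gap between a model's average loss on held-out data and its average loss on the training data. The sketch below illustrates only that underlying quantity; it is not the paper's per-layer procedure, which this summary does not describe. The function name `generalization_error` and the loss values are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def generalization_error(train_losses, test_losses):
    """Empirical generalization error: mean held-out loss minus mean training loss.

    Assumption: both arguments are per-example losses from the same model,
    computed on the training set and on unseen data respectively.
    """
    return mean(test_losses) - mean(train_losses)


# Hypothetical per-example losses, made up for illustration only.
train_losses = [0.10, 0.20, 0.15, 0.15]
test_losses = [0.40, 0.50, 0.45, 0.45]

gap = generalization_error(train_losses, test_losses)
print(f"generalization error: {gap:.2f}")
```

A larger gap suggests that the model fits its training examples noticeably better than unseen ones, which is the kind of signal a membership inference attack can try to exploit.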