The Knowledge Within: Methods for Data-Free Model Compression
This work addresses data privacy and data availability constraints in model compression, enabling deployment in sensitive domains. The contribution is incremental in that it builds on existing compression techniques.
The paper tackles the problem of compressing deep neural networks without access to the original training data, which is crucial for privacy-sensitive applications such as medical and biometric use-cases. It presents methods for generating synthetic samples from trained models and reports negligible accuracy degradation compared to using real data.
Recently, an extensive amount of research has focused on compressing and accelerating Deep Neural Networks (DNNs). So far, algorithms that achieve high compression rates require part of the training dataset for low-precision calibration or for a fine-tuning process. However, this requirement is unacceptable when the data is unavailable or contains sensitive information, as in medical and biometric use-cases. We present three methods for generating synthetic samples from trained models. We then demonstrate how these samples can be used to calibrate and fine-tune quantized models without using any real data in the process. Our best-performing method incurs negligible accuracy degradation compared to using the original training set. This method, which leverages the intrinsic batch normalization statistics of the trained model, can also be used to evaluate data similarity. Our approach opens a path towards genuine data-free model compression, alleviating the need for training data during model deployment.
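To make the idea of using batch normalization statistics as a data-similarity measure more concrete, the following is a minimal sketch and not the paper's implementation. It assumes each batch normalization layer stores a per-channel running mean and variance, and it scores a batch of activations by the Gaussian KL divergence between the batch statistics and the stored ones. The choice of KL divergence, the function names (`channel_stats`, `gaussian_kl`, `bn_stats_divergence`), and the list-of-lists activation layout are all assumptions made for illustration.

```python
import math
from statistics import fmean, pvariance


def channel_stats(batch):
    """Return per-channel mean and population variance.

    batch: list of samples, each a list with one activation value per channel
    (hypothetical layout chosen to keep the sketch dependency-free).
    """
    channels = list(zip(*batch))  # transpose: one tuple of values per channel
    means = [fmean(c) for c in channels]
    variances = [pvariance(c) for c in channels]
    return means, variances


def gaussian_kl(mu_p, var_p, mu_q, var_q, eps=1e-8):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ) for scalar Gaussians."""
    var_p += eps  # guard against zero variance
    var_q += eps
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


def bn_stats_divergence(batch, running_mean, running_var):
    """Average per-channel divergence between batch statistics and the
    statistics stored in a batch normalization layer (assumed measure)."""
    means, variances = channel_stats(batch)
    per_channel = [
        gaussian_kl(m, v, rm, rv)
        for m, v, rm, rv in zip(means, variances, running_mean, running_var)
    ]
    return fmean(per_channel)
```

Under these assumptions, a lower score means the batch looks more like the data the model was trained on. A synthetic-sample generator could in principle minimize such a score summed over all batch normalization layers, although the exact objective used by the authors is not stated in this abstract.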