QuantFace: Towards Lightweight Face Recognition by Synthetic Data Low-bit Quantization
This work addresses the privacy and computational-efficiency issues of deploying face recognition in resource-limited environments. It is an incremental improvement over existing quantization methods.
The paper tackles the problem of deploying deep learning-based face recognition models in computationally constrained scenarios by proposing QuantFace, a solution that uses low-bit precision quantization with synthetic data, achieving up to a 5x reduction in model size while largely maintaining verification performance without accessing real training data.
Deep learning-based face recognition models follow the common trend in deep neural networks of using full-precision floating-point networks with high computational costs. Deploying such networks in use cases with limited computational resources is often infeasible because of the large memory footprint of the full-precision model. Previous compact face recognition approaches designed special compact architectures and trained them from scratch on real training data, which may not be available in a real-world scenario due to privacy concerns. In this work we present QuantFace, a solution based on low-bit precision model quantization. QuantFace reduces the computational cost of existing face recognition models without the need to design a particular architecture or access real training data. QuantFace introduces privacy-friendly synthetic face data into the quantization process to mitigate potential privacy concerns and problems with the accessibility of real training data. Through extensive evaluation experiments on seven benchmarks and four network architectures, we demonstrate that QuantFace can reduce the model size by up to 5x while maintaining, to a large degree, the verification performance of the full-precision model without accessing real training datasets.
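To make the size claim concrete, the following is a minimal sketch of uniform low-bit quantization applied to a handful of weights. It is not the authors' implementation: the function names, the example weights, the 32-bit baseline, and the 8-bit and 6-bit target widths are illustrative assumptions, and the sketch leaves out the synthetic-data step entirely.

```python
def quantize_uniform(weights, num_bits):
    """Map floats to integer codes in [0, 2**num_bits - 1] (asymmetric uniform scheme)."""
    levels = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Step size between adjacent quantization levels; guard against a constant tensor.
    scale = (w_max - w_min) / levels if w_max > w_min else 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Recover approximate float values from the integer codes."""
    return [c * scale + w_min for c in codes]


def compression_ratio(full_bits, low_bits):
    """Ideal storage ratio for the weights alone, ignoring scales and offsets."""
    return full_bits / low_bits


# Hypothetical weights, chosen only for illustration.
weights = [-0.42, -0.07, 0.0, 0.13, 0.58]
codes, scale, w_min = quantize_uniform(weights, num_bits=6)
restored = dequantize(codes, scale, w_min)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("codes:", codes)
print("max reconstruction error:", max_error)
print("32-bit -> 8-bit ratio:", compression_ratio(32, 8))
print("32-bit -> 6-bit ratio:", compression_ratio(32, 6))
```

Under these assumptions, storing weights in 8 bits instead of 32 gives an ideal 4x reduction, and 6 bits gives slightly more than 5x, which is the order of magnitude the abstract reports. The rounding error per weight is bounded by half the step size, which is why accuracy can be largely preserved when the quantized model is subsequently calibrated or fine-tuned.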