Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs
This work addresses the efficiency challenges of deploying DNNs in privacy-sensitive settings and represents an incremental improvement in FHE optimization.
The paper tackles the computational overhead of fully homomorphic encryption (FHE) in deep neural networks by exploiting unstructured sparsity in matrix multiplication. The proposed sparse schemes achieve an average 2.5x performance gain at 50% unstructured sparsity, and a multi-threaded scheme running on 64 cores is 32.5x faster than the equivalent single-threaded sparse computation.
The deployment of deep neural networks (DNNs) in privacy-sensitive environments is constrained by the computational overhead of fully homomorphic encryption (FHE). This paper explores unstructured sparsity in FHE matrix multiplication schemes as a means of reducing this burden while still meeting model accuracy requirements. We demonstrate that sparsity can be exploited in arbitrary matrix multiplication, providing runtime benefits over a naive baseline algorithm at all sparsity levels. This is a notable departure from the plaintext domain, where the benefit of sparsity must be traded off against the overhead of the sparse multiplication algorithm. In addition, we propose three sparse multiplication schemes in FHE based on common plaintext sparse encodings. We demonstrate that the performance gain is scheme-invariant; however, some sparse schemes vastly reduce the memory storage requirements of the encrypted matrix at high sparsity levels. Our proposed sparse schemes yield an average performance gain of 2.5x at 50% unstructured sparsity, with our multi-threading scheme providing a 32.5x performance increase over the equivalent single-threaded sparse computation when utilizing 64 cores.
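To make the claim about skipping zero entries concrete, the following is a minimal plaintext sketch, not the paper's FHE schemes. It compares a naive dense matrix-vector product with one driven by a CSR (compressed sparse row) encoding and counts the multiplications each performs. The choice of CSR, the function names (`to_csr`, `dense_matvec`, `csr_matvec`), and the use of ordinary integers in place of ciphertexts are all assumptions made for illustration. The sketch also assumes the positions of the nonzero entries can be stored in the clear. In a real FHE setting every multiplication counted here would be a costly homomorphic operation, which is a plausible reading of why skipping even a few zeros pays off there, while in plaintext the index bookkeeping can outweigh the savings at low sparsity.

```python
def to_csr(dense):
    """Encode a dense matrix (list of rows) as CSR: values, column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # in FHE this would be an encrypted value (assumption)
                col_idx.append(j)  # index assumed to be stored in the clear (assumption)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def dense_matvec(dense, x):
    """Naive baseline: one multiplication per matrix entry, zero or not."""
    mults = 0
    y = []
    for row in dense:
        acc = 0
        for j, v in enumerate(row):
            acc += v * x[j]  # stand-in for a homomorphic multiply and add
            mults += 1
        y.append(acc)
    return y, mults


def csr_matvec(csr, x):
    """Sparse product: one multiplication per stored (nonzero) entry only."""
    values, col_idx, row_ptr = csr
    mults = 0
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
            mults += 1
        y.append(acc)
    return y, mults


if __name__ == "__main__":
    A = [[1, 0, 0, 2],
         [0, 0, 3, 0],
         [0, 4, 0, 0],
         [5, 0, 0, 6]]  # 6 nonzeros out of 16 entries
    x = [1, 2, 3, 4]
    y_dense, m_dense = dense_matvec(A, x)
    y_sparse, m_sparse = csr_matvec(to_csr(A), x)
    print(y_dense == y_sparse, m_dense, m_sparse)
```

Both routines return the same result, but the dense version performs 16 multiplications on this 4x4 example while the CSR version performs 6, one per nonzero. The CSR arrays also store only 6 values instead of 16, which mirrors the storage reduction the abstract attributes to some sparse schemes at high sparsity.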