Low Latency Privacy Preserving Inference
This work addresses the need for efficient and secure inference on sensitive data, and represents an incremental advance over prior methods.
The paper tackles the high latency and limited network size of privacy-preserving machine learning inference based on Homomorphic Encryption. Its first solution improves latency by more than 10x and supports wider networks than prior work; its second solution enables inference with deep networks at a latency of about 0.16 seconds.
When applying machine learning to sensitive data, one has to find a balance between accuracy, information security, and computational complexity. Recent studies combined Homomorphic Encryption with neural networks to make inferences while protecting against information leakage. However, these methods are limited by the width and depth of the neural networks that can be used (and hence by the achievable accuracy), and they exhibit high latency even for relatively simple networks. In this study we provide two solutions that address these limitations. In the first solution, we achieve a more than $10\times$ improvement in latency and enable inference on wider networks compared to prior attempts with the same level of security. The improved performance is achieved by novel methods to represent the data during the computation. In the second solution, we apply the method of transfer learning to provide private inference services using deep networks with a latency of $\sim0.16$ seconds. We demonstrate the efficacy of our methods on several computer vision tasks.
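To make the data flow of the second solution concrete, the following is a minimal sketch under stated assumptions, not the implementation used in this study. It assumes one plausible reading of the transfer-learning setup: the client computes features with a public model in the clear, encrypts only those features, and the server applies a shallow linear model to the ciphertexts. A toy Paillier scheme (additively homomorphic, with parameters far too small to be secure) is swapped in for the Homomorphic Encryption scheme, and `public_features`, `encrypted_linear`, the weights, and the input values are all hypothetical names and data chosen for illustration.

```python
import math
import secrets

# Toy Paillier key material built from two Mersenne primes.
# ASSUMPTION: illustration only; these parameters are NOT secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                 # public modulus
N2 = N * N
PHI = (P - 1) * (Q - 1)   # private key
MU = pow(PHI, -1, N)      # modular inverse (Python 3.8+)


def encrypt(m):
    """Encrypt a (possibly negative) integer with the public modulus N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (1 + N)^m mod N^2 equals 1 + m*N, so no large exponentiation is needed.
    return ((1 + (m % N) * N) * pow(r, N, N2)) % N2


def decrypt(c):
    """Decrypt with the private key and map back to a signed integer."""
    m = ((pow(c, PHI, N2) - 1) // N) * MU % N
    return m - N if m > N // 2 else m


def public_features(pixels):
    """Hypothetical stand-in for a public pretrained feature extractor.

    It runs on the client, in the clear, and returns integer features.
    """
    return [sum(pixels), max(pixels) - min(pixels),
            sum(x * x for x in pixels) // len(pixels)]


def encrypted_linear(enc_features, weights, bias):
    """Server side: bias + sum(w_i * f_i), computed on ciphertexts only.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to a
    plaintext integer power multiplies its plaintext by that integer.
    """
    acc = encrypt(bias)
    for c, w in zip(enc_features, weights):
        acc = (acc * pow(c, w % N, N2)) % N2
    return acc


# Client: extract features locally, then encrypt them.
pixels = [3, 7, 1, 9]                    # made-up input
feats = public_features(pixels)
enc_feats = [encrypt(f) for f in feats]

# Server: evaluate its private linear model without seeing the features.
weights, bias = [2, -5, 1], 4            # made-up integer model
enc_score = encrypted_linear(enc_feats, weights, bias)

# Client: decrypt the result and compare with the plaintext computation.
score = decrypt(enc_score)
expected = bias + sum(w * f for w, f in zip(weights, feats))
print(score, expected)
```

The appeal of this design is that the encrypted part of the computation is shallow: the server evaluates a single linear layer over ciphertexts, so its cost does not grow with the depth of the feature extractor. A real deployment would use quantized real-valued features and a production Homomorphic Encryption library rather than this toy scheme.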