Partially Oblivious Neural Network Inference
This work addresses the trade-off between security and efficiency for privacy-preserving machine learning, offering a practical solution for scenarios where some information leakage is acceptable, though it is incremental in nature.
The paper tackles the efficiency limitations of secure oblivious inference for neural networks by introducing partially oblivious inference, which allows controlled leakage of model weights. The authors demonstrate that leaking up to 80% of the weights in a CIFAR-10 network has practically no security impact, while the necessary homomorphic encryption multiplications run four times faster.
Oblivious inference is the task of outsourcing an ML model, such as a neural network, without disclosing critical and sensitive information, such as the model's parameters. The most prominent solutions for secure oblivious inference are based on powerful cryptographic tools, like Homomorphic Encryption (HE) and/or multi-party computation (MPC). Even though the implementation of oblivious inference schemes has improved impressively over the last decade, there are still significant limitations on the ML models that they can practically implement, especially when the confidentiality of both the ML model and the input data must be protected. In this paper, we introduce the notion of partially oblivious inference. We empirically show that for neural network models, like CNNs, some information leakage can be acceptable. We therefore propose a novel trade-off between security and efficiency. In our research, we investigate the impact on security and inference runtime performance of partially leaking the CNN model's weights. We experimentally demonstrate that in a CIFAR-10 network we can leak up to $80\%$ of the model's weights with practically no security impact, while the necessary HE multiplications are performed four times faster.
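The idea of leaking a fraction of the weights can be illustrated with a minimal sketch. It assumes, for illustration only, that a leaked weight can be used as a plaintext operand against an encrypted input, while a protected weight must stay encrypted and so needs a ciphertext-ciphertext multiplication. The abstract does not state this mechanism, and the function names (`split_weights`, `count_multiplications`, `partial_dot`), the random choice of which weights to leak, and the use of ordinary arithmetic in place of a real HE library are all hypothetical.

```python
import random


def split_weights(weights, leak_fraction, seed=0):
    """Return a mask: True means the weight is leaked, False means it stays protected.

    Assumption: weights are picked uniformly at random; a real scheme would
    choose them by some security criterion.
    """
    rng = random.Random(seed)
    n = len(weights)
    n_leak = int(round(leak_fraction * n))
    leaked = set(rng.sample(range(n), n_leak))
    return [i in leaked for i in range(n)]


def count_multiplications(mask):
    """Count multiplications by operand type under the assumption above."""
    plain = sum(mask)
    return {
        "plaintext_ciphertext": plain,              # leaked weight times encrypted input
        "ciphertext_ciphertext": len(mask) - plain,  # protected weight times encrypted input
    }


def partial_dot(weights, inputs, mask):
    """Dot product split into a leaked part and a protected part.

    Plain Python arithmetic stands in for HE operations, so the result is
    identical to an ordinary dot product; only the bookkeeping differs.
    """
    leaked_part = sum(w * x for w, x, m in zip(weights, inputs, mask) if m)
    protected_part = sum(w * x for w, x, m in zip(weights, inputs, mask) if not m)
    return leaked_part + protected_part


weights = list(range(10))
inputs = [1] * 10
mask = split_weights(weights, leak_fraction=0.8)
counts = count_multiplications(mask)
result = partial_dot(weights, inputs, mask)
```

With a leak fraction of 0.8, eight of the ten multiplications in this toy layer fall into the plaintext-ciphertext category. How that shift translates into runtime depends on the HE scheme and its parameters, which the sketch does not model.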