CR · LG · Jun 13, 2025

SecONNds: Secure Outsourced Neural Network Inference on ImageNet

arXiv:2506.11586v1 · 2 citations · h-index: 2 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses privacy challenges in outsourced AI for resource-constrained environments by providing a more efficient secure inference solution. The advance is incremental, since it builds on existing secure computation protocols.

The paper tackles the high computational and communication costs of secure outsourced neural network inference by introducing SecONNds, a framework optimized for ImageNet-scale CNNs. It reports a 17× online speedup in nonlinear operations over state-of-the-art solutions and, on a 37-bit quantized SqueezeNet model, end-to-end inference times of 2.8 s on GPU and 3.6 s on CPU with 420 MiB of total communication.

The widespread adoption of outsourced neural network inference presents significant privacy challenges, as sensitive user data is processed on untrusted remote servers. Secure inference offers a privacy-preserving solution, but existing frameworks suffer from high computational overhead and communication costs, rendering them impractical for real-world deployment. We introduce SecONNds, a non-intrusive secure inference framework optimized for large ImageNet-scale Convolutional Neural Networks. SecONNds integrates a novel fully Boolean Goldreich-Micali-Wigderson (GMW) protocol for secure comparison (addressing Yao's millionaires' problem) using preprocessed Beaver's bit triples generated from Silent Random Oblivious Transfer. Our novel protocol achieves an online speedup of 17× in nonlinear operations compared to state-of-the-art solutions while reducing communication overhead. To further enhance performance, SecONNds employs Number Theoretic Transform (NTT) preprocessing and leverages GPU acceleration for homomorphic encryption operations, resulting in speedups of 1.6× on CPU and 2.2× on GPU for linear operations. We also present SecONNds-P, a bit-exact variant that ensures verifiable full-precision results in secure computation, matching the results of plaintext computations. Evaluated on a 37-bit quantized SqueezeNet model, SecONNds achieves an end-to-end inference time of 2.8 s on GPU and 3.6 s on CPU, with a total communication of just 420 MiB. SecONNds' efficiency and reduced computational load make it well-suited for deploying privacy-sensitive applications in resource-constrained environments. SecONNds is open source and can be accessed from: https://github.com/shashankballa/SecONNds.
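To make the idea of a Boolean GMW comparison built from Beaver bit triples more concrete, here is a minimal textbook-style sketch in Python. It is not the SecONNds protocol: both parties are simulated in one process, a trusted dealer (`dealer_triple`) stands in for the Silent Random Oblivious Transfer preprocessing, the comparison is a plain bit-serial circuit rather than the paper's optimized one, and all function names are hypothetical.

```python
import secrets

# A shared bit is a pair (share0, share1) with bit = share0 XOR share1.

def share(bit):
    """Split a bit into two random XOR shares."""
    r = secrets.randbits(1)
    return r, bit ^ r

def reveal(s):
    """Reconstruct a bit from its two shares."""
    return s[0] ^ s[1]

def dealer_triple():
    """Assumption: a trusted dealer replaces the Silent ROT preprocessing.
    Returns shares of random bits a, b and of c = a AND b."""
    a, b = secrets.randbits(1), secrets.randbits(1)
    return share(a), share(b), share(a & b)

def xor(x, y):
    """XOR of shared bits is local: each party XORs its own shares."""
    return x[0] ^ y[0], x[1] ^ y[1]

def neg(x):
    """NOT of a shared bit: only party 0 flips its share."""
    return x[0] ^ 1, x[1]

def secure_and(x, y):
    """AND of two shared bits using one Beaver bit triple."""
    a, b, c = dealer_triple()
    # Each party masks its shares with the triple; the masked bits are
    # exchanged and opened. d and e reveal nothing since a, b are random.
    d = (x[0] ^ a[0]) ^ (x[1] ^ a[1])   # d = x XOR a
    e = (y[0] ^ b[0]) ^ (y[1] ^ b[1])   # e = y XOR b
    z0 = c[0] ^ (d & b[0]) ^ (e & a[0]) ^ (d & e)  # party 0 adds d AND e
    z1 = c[1] ^ (d & b[1]) ^ (e & a[1])
    return z0, z1

def secure_gt(x, y, nbits):
    """Shares of [x > y], where party 0 holds x and party 1 holds y."""
    gt = (0, 0)   # shares of 0
    eq = (1, 0)   # shares of 1: all higher bits are equal so far
    for i in reversed(range(nbits)):      # most significant bit first
        xi = ((x >> i) & 1, 0)            # party 0's input bit, trivially shared
        yi = (0, (y >> i) & 1)            # party 1's input bit, trivially shared
        gt_i = secure_and(xi, neg(yi))    # x_i = 1 and y_i = 0
        gt = xor(gt, secure_and(eq, gt_i))  # cases are disjoint, so OR == XOR
        eq = secure_and(eq, neg(xor(xi, yi)))
    return gt

print(reveal(secure_gt(41, 17, 8)))  # 1, since 41 > 17
```

Each bit position costs three AND gates here, and every AND consumes one preprocessed triple plus the exchange of two masked bits, which is why moving triple generation into an offline phase leaves only cheap XORs and bit exchanges online.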
