Efficient Softmax Reformulation for Homomorphic Encryption via Moment Generating Function
The work addresses a known bottleneck in privacy-preserving machine learning, namely softmax evaluation during encrypted inference, though it is an incremental improvement over existing methods.
The paper tackles the computationally expensive evaluation of softmax under homomorphic encryption by proposing MGF-softmax, a reformulation that reduces multiplicative depth while keeping accuracy close to that of exact methods.
Homomorphic encryption (HE) is a prominent framework for privacy-preserving machine learning, enabling inference directly on encrypted data. However, evaluating softmax, a core component of transformer architectures, remains particularly challenging in HE due to its multivariate structure, the large dynamic range induced by exponential functions, and the need for accurate division during normalization. In this paper, we propose MGF-softmax, a novel softmax reformulation based on the moment generating function (MGF) that replaces the softmax denominator with its moment-based counterpart. This reformulation substantially reduces multiplicative depth while preserving key properties of softmax and asymptotically converging to the exact softmax as the number of input tokens increases. Extensive experiments on Vision Transformers and large language models show that MGF-softmax provides an efficient and accurate approximation of softmax in encrypted inference. In particular, it achieves inference accuracy close to that of high-depth exact methods, while requiring substantially lower computational cost through reduced multiplicative depth.
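The abstract does not spell out the moment-based denominator, so the following plaintext sketch shows only one plausible reading and should not be taken as the paper's construction. It assumes the inputs are treated as roughly Gaussian, so that the denominator sum_j exp(x_j) = n * E[exp(X)] is replaced by n * exp(mean + var / 2), the Gaussian moment generating function evaluated at t = 1. The function names and the Gaussian assumption are illustrative choices, and no encryption is performed here.

```python
import math
import random


def exact_softmax(xs):
    """Reference softmax with the usual max-shift for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_mgf_softmax(xs):
    """Hypothetical moment-based softmax (assumption, not the paper's formula).

    The denominator sum_j exp(x_j) is replaced by n * exp(mean + var / 2),
    i.e. n times the Gaussian MGF at t = 1 built from the sample mean and
    the (population) sample variance of the inputs.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    # Normalization becomes a subtraction inside the exponent,
    # so this sketch needs no explicit division by a data-dependent sum.
    log_denominator = math.log(n) + mean + var / 2.0
    return [math.exp(x - log_denominator) for x in xs]


def max_abs_gap(n, seed=0):
    """Largest elementwise gap between the two variants on n Gaussian inputs."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    exact = exact_softmax(xs)
    approx = gaussian_mgf_softmax(xs)
    return max(abs(a - b) for a, b in zip(exact, approx))


if __name__ == "__main__":
    # Inspect how the gap behaves as the number of inputs grows.
    for n in (16, 256, 4096):
        print(n, max_abs_gap(n))
```

Under this reading, the only data-dependent quantities are the first two moments of the inputs, which require additions and one squaring rather than a division by an encrypted sum. The outputs need not sum exactly to one for finite inputs, and the quality of the approximation depends on how well the Gaussian assumption fits the actual attention scores.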