LG, CR · Nov 23, 2020

Effectiveness of MPC-friendly Softmax Replacement

arXiv:2011.11202v2 · 10 citations
AI Analysis

This work addresses efficient and accurate softmax computation in multi-party computation (MPC). It is aimed at researchers and practitioners in secure deep learning.

This paper investigates the effectiveness of an MPC-friendly softmax replacement, finding that it only offers significant speed-up for single-layer networks while consistently reducing accuracy. The authors conclude that its utility is limited compared to the original softmax function.

Softmax is widely used in deep learning to map some representation to a probability distribution. As it is based on exp/log functions that are relatively expensive in multi-party computation, Mohassel and Zhang (2017) proposed a simpler replacement based on ReLU to be used in secure computation. However, we could not reproduce the accuracy they reported for training on MNIST with three fully connected layers. Later works (e.g., Wagh et al., 2019 and 2021) used the softmax replacement not for computing the output probability distribution but for approximating the gradient in back-propagation. In this work, we analyze the two uses of the replacement and compare them to softmax, both in terms of accuracy and cost in multi-party computation. We found that the replacement only provides a significant speed-up for a one-layer network while it always reduces accuracy, sometimes significantly. Thus we conclude that its usefulness is limited and one should use the original softmax function instead.
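To make the comparison in the abstract concrete, the following is a minimal plain-Python sketch (no MPC involved) of the standard softmax next to a ReLU-based replacement of the form ReLU(x_i) / sum_j ReLU(x_j). The function names, the uniform fallback when every input is non-positive, and the gradient helper are assumptions made for illustration; they are not taken from the paper's code. The gradient helper uses the standard cross-entropy result that the gradient with respect to the logits is the predicted distribution minus the one-hot label, so passing in the replacement's output gives an approximate gradient in the spirit of the second use described above.

```python
import math


def softmax(logits):
    """Standard softmax, shifted by the maximum for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relu_softmax(logits):
    """ReLU-based replacement: ReLU(x_i) / sum_j ReLU(x_j).

    Assumption: if every ReLU output is zero, fall back to a uniform
    distribution. This fallback is chosen here for illustration only.
    """
    relus = [max(x, 0.0) for x in logits]
    total = sum(relus)
    if total == 0.0:
        return [1.0 / len(logits)] * len(logits)
    return [r / total for r in relus]


def output_gradient(probs, one_hot):
    """Cross-entropy gradient with respect to the logits: probs - label.

    Exact when probs comes from softmax; an approximation when probs
    comes from the ReLU-based replacement.
    """
    return [p - y for p, y in zip(probs, one_hot)]


logits = [2.0, 1.0, -1.0]
label = [1.0, 0.0, 0.0]

print("softmax:      ", softmax(logits))
print("replacement:  ", relu_softmax(logits))
print("exact grad:   ", output_gradient(softmax(logits), label))
print("approx. grad: ", output_gradient(relu_softmax(logits), label))
```

Note that the replacement assigns exactly zero probability to any class with a non-positive input, whereas softmax always assigns a strictly positive probability. The replacement avoids the exponential but still needs comparisons for ReLU and a division, both of which carry a cost in multi-party computation.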
