Marcel Keller

5papers

407citations

Novelty34%

AI Score22

Ranked #187,939 of 205,806 authors (top 91%)#6,005 in CR (top 82%)

5 Papers

LGJul 1, 2021

Secure Quantized Training for Deep Learning

Marcel Keller, Ke Sun

We implement training of neural networks in secure multi-party computation (MPC) using quantization commonly used in said setting. We are the first to present an MNIST classifier purely trained in MPC that comes within 0.2 percent of the accuracy of the same convolutional neural network trained via plaintext computation. More concretely, we have trained a network with two convolutional and two dense layers to 99.2% accuracy in 3.5 hours (under one hour for 99% accuracy). We have also implemented AlexNet for CIFAR-10, which converges in a few hours. We develop novel protocols for exponentiation and inverse square root. Finally, we present experiments in a range of MPC security models for up to ten parties, both with honest and dishonest majority as well as semi-honest and malicious security.

CLMar 26, 2021

Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers

Markus Bayer, Marc-André Kaufhold, Björn Buchhold et al.

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve classifiers by artificially created training data. In NLP, there is the challenge of establishing universal rules for text transformations which provide new linguistic patterns. In this paper, we present and evaluate a text generation method suitable to increase the performance of classifiers for long and short texts. We achieved promising improvements when evaluating short as well as long text tasks with the enhancement by our text generation method. Especially with regard to small data analytics, additive accuracy gains of up to 15.53% and 3.56% are achieved within a constructed low data regime, compared to the no augmentation baseline and another data augmentation technique. As the current track of these constructed regimes is not universally applicable, we also show major improvements in several real world low data tasks (up to +4.84 F1-score). Since we are evaluating the method from many perspectives (in total 11 datasets), we also observe situations where the method might not be suitable. We discuss implications and patterns for the successful application of our approach on different types of datasets.

LGNov 23, 2020

Effectiveness of MPC-friendly Softmax Replacement

Marcel Keller, Ke Sun

Softmax is widely used in deep learning to map some representation to a probability distribution. As it is based on exp/log functions that are relatively expensive in multi-party computation, Mohassel and Zhang (2017) proposed a simpler replacement based on ReLU to be used in secure computation. However, we could not reproduce the accuracy they reported for training on MNIST with three fully connected layers. Later works (e.g., Wagh et al., 2019 and 2021) used the softmax replacement not for computing the output probability distribution but for approximating the gradient in back-propagation. In this work, we analyze the two uses of the replacement and compare them to softmax, both in terms of accuracy and cost in multi-party computation. We found that the replacement only provides a significant speed-up for a one-layer network while it always reduces accuracy, sometimes significantly. Thus we conclude that its usefulness is limited and one should use the original softmax function instead.

CROct 28, 2019

Secure Evaluation of Quantized Neural Networks

Anders Dalskov, Daniel Escudero, Marcel Keller

We investigate two questions in this paper: First, we ask to what extent "MPC friendly" models are already supported by major Machine Learning frameworks such as TensorFlow or PyTorch. Prior works provide protocols that only work on fixed-point integers and specialized activation functions, two aspects that are not supported by popular Machine Learning frameworks, and the need for these specialized model representations means that it is hard, and often impossible, to use e.g., TensorFlow to design, train and test models that later have to be evaluated securely. Second, we ask to what extent the functionality for evaluating Neural Networks already exists in general-purpose MPC frameworks. These frameworks have received more scrutiny, are better documented and supported on more platforms. Furthermore, they are typically flexible in terms of the threat model they support. In contrast, most secure evaluation protocols in the literature are targeted to a specific threat model and their implementations are only a "proof-of-concept", making it very hard for their adoption in practice. We answer both of the above questions in a positive way: We observe that the quantization techniques supported by both TensorFlow, PyTorch and MXNet can provide models in a representation that can be evaluated securely; and moreover, that this evaluation can be performed by a general purpose MPC framework. We perform extensive benchmarks to understand the exact trade-offs between different corruption models, network sizes and efficiency. These experiments provide an interesting insight into cost between active and passive security, as well as honest and dishonest majority. Our work shows then that the separating line between existing ML frameworks and existing MPC protocols may be narrower than implicitly suggested by previous works.

CROct 24, 2019

A Note on Our Submission to Track 4 of iDASH 2019

Marcel Keller, Ke Sun

iDASH is a competition soliciting implementations of cryptographic schemes of interest in the context of biology. In 2019, one track asked for multi-party computation implementations of training of a machine learning model suitable for two datasets from cancer research. In this note, we describe our solution submitted to the competition. We found that the training can be run on three AWS c5.9xlarge instances in less then one minute using MPC tolerating one semi-honest corruption, and less than ten seconds at a slightly lower accuracy.