CRJul 11, 2022
SALSA: Attacking Lattice Cryptography with TransformersEmily Wenger, Mingjie Chen, François Charton et al.
Currently deployed public-key cryptosystems will be vulnerable to attacks by full-scale quantum computers. Consequently, "quantum resistant" cryptosystems are in high demand, and lattice-based cryptosystems, based on a hard problem known as Learning With Errors (LWE), have emerged as strong contenders for standardization. In this work, we train transformers to perform modular arithmetic and combine half-trained models with statistical cryptanalysis techniques to propose SALSA: a machine learning attack on LWE-based cryptographic schemes. SALSA can fully recover secrets for small-to-mid size LWE instances with sparse binary secrets, and may scale to attack real-world LWE-based cryptosystems.
CRMar 7, 2023
SALSA PICANTE: a machine learning attack on LWE with binary secretsCathy Li, Jana Sotáková, Emily Wenger et al.
Learning with Errors (LWE) is a hard math problem underpinning many proposed post-quantum cryptographic (PQC) systems. The only PQC Key Exchange Mechanism (KEM) standardized by NIST is based on module~LWE, and current publicly available PQ Homomorphic Encryption (HE) libraries are based on ring LWE. The security of LWE-based PQ cryptosystems is critical, but certain implementation choices could weaken them. One such choice is sparse binary secrets, desirable for PQ HE schemes for efficiency reasons. Prior work, SALSA, demonstrated a machine learning-based attack on LWE with sparse binary secrets in small dimensions ($n \le 128$) and low Hamming weights ($h \le 4$). However, this attack assumes access to millions of eavesdropped LWE samples and fails at higher Hamming weights or dimensions. We present PICANTE, an enhanced machine learning attack on LWE with sparse binary secrets, which recovers secrets in much larger dimensions (up to $n=350$) and with larger Hamming weights (roughly $n/10$, and up to $h=60$ for $n=350$). We achieve this dramatic improvement via a novel preprocessing step, which allows us to generate training data from a linear number of eavesdropped LWE samples ($4n$) and changes the distribution of the data to improve transformer training. We also improve the secret recovery methods of SALSA and introduce a novel cross-attention recovery mechanism allowing us to read off the secret directly from the trained models. While PICANTE does not threaten NIST's proposed LWE standards, it demonstrates significant improvement over SALSA and could scale further, highlighting the need for future investigation into machine learning attacks on LWE with sparse binary secrets.
35.5CRApr 5
Improving ML Attacks on LWE with Data Repetition and Stepwise RegressionAlberto Alfarano, Eshika Saxena, Emily Wenger et al.
The Learning with Errors (LWE) problem is a hard math problem in lattice-based cryptography. In the simplest case of binary secrets, it is the subset sum problem, with error. Effective ML attacks on LWE were demonstrated in the case of binary, ternary, and small secrets, succeeding on fairly sparse secrets. The ML attacks recover secrets with up to 3 active bits in the "cruel region" (Nolte et al., 2024) on samples pre-processed with BKZ. We show that using larger training sets and repeated examples enables recovery of denser secrets. Empirically, we observe a power-law relationship between model-based attempts to recover the secrets, dataset size, and repeated examples. We introduce a stepwise regression technique to recover the "cool bits" of the secret.
AIDec 20, 2024
Formal Mathematical Reasoning: A New Frontier in AIKaiyu Yang, Gabriel Poesia, Jingxuan He et al.
AI for Mathematics (AI4Math) is not only intriguing intellectually but also crucial for AI-driven discovery in science, engineering, and beyond. Extensive efforts on AI4Math have mirrored techniques in NLP, in particular, training large language models on carefully curated math datasets in text form. As a complementary yet less explored avenue, formal mathematical reasoning is grounded in formal systems such as proof assistants, which can verify the correctness of reasoning and provide automatic feedback. In this position paper, we advocate for formal mathematical reasoning and argue that it is indispensable for advancing AI4Math to the next level. In recent years, we have seen steady progress in using AI to perform formal reasoning, including core tasks such as theorem proving and autoformalization, as well as emerging applications such as verifiable generation of code and hardware designs. However, significant challenges remain to be solved for AI to truly master mathematics and achieve broader impact. We summarize existing progress, discuss open challenges, and envision critical milestones to measure future success. At this inflection point for formal mathematical reasoning, we call on the research community to come together to drive transformative advancements in this field.
CRFeb 2, 2024
Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With ErrorsSamuel Stevens, Emily Wenger, Cathy Li et al.
Learning with Errors (LWE) is a hard math problem underlying recently standardized post-quantum cryptography (PQC) systems for key exchange and digital signatures. Prior work proposed new machine learning (ML)-based attacks on LWE problems with small, sparse secrets, but these attacks require millions of LWE samples to train on and take days to recover secrets. We propose three key methods -- better preprocessing, angular embeddings and model pre-training -- to improve these attacks, speeding up preprocessing by $25\times$ and improving model sample efficiency by $10\times$. We demonstrate for the first time that pre-training improves and reduces the cost of ML attacks on LWE. Our architecture improvements enable scaling to larger-dimension LWE problems: this work is the first instance of ML attacks recovering sparse binary secrets in dimension $n=1024$, the smallest dimension used in practice for homomorphic encryption applications of LWE where sparse binary secrets are proposed.
LGFeb 29, 2024
Machine learning for modular multiplicationKristin Lauter, Cathy Yuanchen Li, Krystal Maughan et al.
Motivated by cryptographic applications, we investigate two machine learning approaches to modular multiplication: namely circular regression and a sequence-to-sequence transformer model. The limited success of both methods demonstrated in our results gives evidence for the hardness of tasks involving modular multiplication upon which cryptosystems are based.
LGOct 9, 2025
TAPAS: Datasets for Learning the Learning with Errors ProblemEshika Saxena, Alberto Alfarano, François Charton et al.
AI-powered attacks on Learning with Errors (LWE), an important hard math problem in post-quantum cryptography, rival or outperform "classical" attacks on LWE under certain parameter settings. Despite the promise of this approach, a dearth of accessible data limits AI practitioners' ability to study and improve these attacks. Creating LWE data for AI model training is time- and compute-intensive and requires significant domain expertise. To fill this gap and accelerate AI research on LWE attacks, we propose the TAPAS datasets, a Toolkit for Analysis of Post-quantum cryptography using AI Systems. These datasets cover several LWE settings and can be used off-the-shelf by AI practitioners to prototype new approaches to cracking LWE. This work documents TAPAS dataset creation, establishes attack performance baselines, and lays out directions for future work.
CRJul 22, 2021
A Cryptographic Hash Function from Markoff TriplesElena Fuchs, Kristin Lauter, Matthew Litman et al.
Cryptographic hash functions from expander graphs were proposed by Charles, Goren, and Lauter in [CGL] based on the hardness of finding paths in the graph. In this paper, we propose a new candidate for a hash function based on the hardness of finding paths in the graph of Markoff triples modulo p. These graphs have been studied extensively in number theory and various other fields, and yet finding paths in the graphs remains difficult. We discuss the hardness of finding paths between points, based on the structure of the Markoff graphs. We investigate several possible avenues for attack and estimate their running time to be greater than O(p). In particular, we analyze a recent groundbreaking proof in [BGS1] that such graphs are connected and discuss how this proof gives an algorithm for finding paths
CRApr 19, 2021
Secure Human Action Recognition by Encrypted Neural Network InferenceMiran Kim, Xiaoqian Jiang, Kristin Lauter et al.
Advanced computer vision technology can provide near real-time home monitoring to support "aging in place" by detecting falls and symptoms related to seizures and stroke. Affordable webcams, together with cloud computing services (to run machine learning algorithms), can potentially bring significant social benefits. However, it has not been deployed in practice because of privacy concerns. In this paper, we propose a strategy that uses homomorphic encryption to resolve this dilemma, which guarantees information confidentiality while retaining action detection. Our protocol for secure inference can distinguish falls from activities of daily living with 86.21% sensitivity and 99.14% specificity, with an average inference latency of 1.2 seconds and 2.4 seconds on real-world test datasets using small and large neural nets, respectively. We show that our method enables a 613x speedup over the latency-optimized LoLa and achieves an average of 3.1x throughput increase in secure inference compared to the throughput-optimized nGraph-HE2.
CRJul 17, 2020
Computing Blindfolded on Data Homomorphically Encrypted under Multiple Keys: An Extended SurveyAsma Aloufi, Peizhao Hu, Yongsoo Song et al.
New cryptographic techniques such as homomorphic encryption (HE) allow computations to be outsourced to and evaluated blindfolded in a resourceful cloud. These computations often require private data owned by multiple participants, engaging in joint evaluation of some functions. For example, Genome-Wide Association Study (GWAS) is becoming feasible because of recent proliferation of genome sequencing technology. Due to the sensitivity of genomic data, these data should be encrypted using different keys. However, supporting computation on ciphertexts encrypted under multiple keys is a non-trivial task. In this paper, we present a comprehensive survey on different state-of-the-art cryptographic techniques and schemes that are commonly used. We review techniques and schemes including Attribute-Based Encryption (ABE), Proxy Re-Encryption (PRE), Threshold Homomorphic Encryption (ThHE), and Multi-Key Homomorphic Encryption (MKHE). We analyze them based on different system and security models, and examine their complexities. We share lessons learned and draw observations for designing better schemes with reduced overheads.
LGJul 9, 2020
Transparency Tools for Fairness in AI (Luskin)Mingliang Chen, Aria Shahverdi, Sarah Anderson et al.
We propose new tools for policy-makers to use when assessing and correcting fairness and bias in AI algorithms. The three tools are: - A new definition of fairness called "controlled fairness" with respect to choices of protected features and filters. The definition provides a simple test of fairness of an algorithm with respect to a dataset. This notion of fairness is suitable in cases where fairness is prioritized over accuracy, such as in cases where there is no "ground truth" data, only data labeled with past decisions (which may have been biased). - Algorithms for retraining a given classifier to achieve "controlled fairness" with respect to a choice of features and filters. Two algorithms are presented, implemented and tested. These algorithms require training two different models in two stages. We experiment with combinations of various types of models for the first and second stage and report on which combinations perform best in terms of fairness and accuracy. - Algorithms for adjusting model parameters to achieve a notion of fairness called "classification parity". This notion of fairness is suitable in cases where accuracy is prioritized. Two algorithms are presented, one which assumes that protected features are accessible to the model during testing, and one which assumes protected features are not accessible during testing. We evaluate our tools on three different publicly available datasets. We find that the tools are useful for understanding various dimensions of bias, and that in practice the algorithms are effective in starkly reducing a given observed bias when tested on new data.
CRMay 19, 2019
Toward Scalable Fully Homomorphic Encryption Through Light Trusted Computing AssistanceWenhao Wang, Yichen Jiang, Qintao Shen et al.
It has been a long standing problem to securely outsource computation tasks to an untrusted party with integrity and confidentiality guarantees. While fully homomorphic encryption (FHE) is a promising technique that allows computations performed on the encrypted data, it suffers from a significant slow down to the computation. In this paper we propose a hybrid solution that uses the latest hardware Trusted Execution Environments (TEEs) to assist FHE by moving the bootstrapping step, which is one of the major obstacles in designing practical FHE schemes, to a secured SGX enclave. TEEFHE, the hybrid system we designed, makes it possible for homomorphic computations to be performed on smaller ciphertext and secret key, providing better performance and lower memory consumption. We make an effort to mitigate side channel leakages within SGX by making the memory access patterns totally independent from the secret information. The evaluation shows that TEEFHE effectively improves the software only FHE schemes in terms of both time and space.
CRFeb 19, 2019
XONN: XNOR-based Oblivious Deep Neural Network InferenceM. Sadegh Riazi, Mohammad Samragh, Hao Chen et al.
Advancements in deep learning enable cloud servers to provide inference-as-a-service for clients. In this scenario, clients send their raw data to the server to run the deep learning model and send back the results. One standing challenge in this setting is to ensure the privacy of the clients' sensitive data. Oblivious inference is the task of running the neural network on the client's input without disclosing the input or the result to the server. This paper introduces XONN, a novel end-to-end framework based on Yao's Garbled Circuits (GC) protocol, that provides a paradigm shift in the conceptual and practical realization of oblivious inference. In XONN, the costly matrix-multiplication operations of the deep learning model are replaced with XNOR operations that are essentially free in GC. We further provide a novel algorithm that customizes the neural network such that the runtime of the GC protocol is minimized without sacrificing the inference accuracy. We design a user-friendly high-level API for XONN, allowing expression of the deep learning model architecture in an unprecedented level of abstraction. Extensive proof-of-concept evaluation on various neural network architectures demonstrates that XONN outperforms prior art such as Gazelle (USENIX Security'18) by up to 7x, MiniONN (ACM CCS'17) by 93x, and SecureML (IEEE S&P'17) by 37x. State-of-the-art frameworks require one round of interaction between the client and the server for each layer of the neural network, whereas, XONN requires a constant round of interactions for any number of layers in the model. XONN is first to perform oblivious inference on Fitnet architectures with up to 21 layers, suggesting a new level of scalability compared with state-of-the-art. Moreover, we evaluate XONN on four datasets to perform privacy-preserving medical diagnosis.
LGOct 1, 2018
CHET: Compiler and Runtime for Homomorphic Evaluation of Tensor ProgramsRoshan Dathathri, Olli Saarikivi, Hao Chen et al.
Fully Homomorphic Encryption (FHE) refers to a set of encryption schemes that allow computations to be applied directly on encrypted data without requiring a secret key. This enables novel application scenarios where a client can safely offload storage and computation to a third-party cloud provider without having to trust the software and the hardware vendors with the decryption keys. Recent advances in both FHE schemes and implementations have moved such applications from theoretical possibilities into the realm of practicalities. This paper proposes a compact and well-reasoned interface called the Homomorphic Instruction Set Architecture (HISA) for developing FHE applications. Just as the hardware ISA interface enabled hardware advances to proceed independent of software advances in the compiler and language runtimes, HISA decouples compiler optimizations and runtimes for supporting FHE applications from advancements in the underlying FHE schemes. This paper demonstrates the capabilities of HISA by building an end-to-end software stack for evaluating neural network models on encrypted data. Our stack includes an end-to-end compiler, runtime, and a set of optimizations. Our approach shows generated code, on a set of popular neural network architectures, is faster than hand-optimized implementations.
CRJul 9, 2018
Multiparty Non-Interactive Key Exchange and More From Isogenies on Elliptic CurvesDan Boneh, Darren Glass, Daniel Krashen et al.
We describe a framework for constructing an efficient non-interactive key exchange (NIKE) protocol for n parties for any n >= 2. Our approach is based on the problem of computing isogenies between isogenous elliptic curves, which is believed to be difficult. We do not obtain a working protocol because of a missing step that is currently an open mathematical problem. What we need to complete our protocol is an efficient algorithm that takes as input an abelian variety presented as a product of isogenous elliptic curves, and outputs an isomorphism invariant of the abelian variety. Our framework builds a cryptographic invariant map, which is a new primitive closely related to a cryptographic multilinear map, but whose range does not necessarily have a group structure. Nevertheless, we show that a cryptographic invariant map can be used to build several cryptographic primitives, including NIKE, that were previously constructed from multilinear maps and indistinguishability obfuscation.
NTJun 14, 2018
Ramanujan graphs in cryptographyAnamaria Costache, Brooke Feigon, Kristin Lauter et al.
In this paper we study the security of a proposal for Post-Quantum Cryptography from both a number theoretic and cryptographic perspective. Charles-Goren-Lauter in 2006 [CGL06] proposed two hash functions based on the hardness of finding paths in Ramanujan graphs. One is based on Lubotzky-Phillips-Sarnak (LPS) graphs and the other one is based on Supersingular Isogeny Graphs. A 2008 paper by Petit-Lauter-Quisquater breaks the hash function based on LPS graphs. On the Supersingular Isogeny Graphs proposal, recent work has continued to build cryptographic applications on the hardness of finding isogenies between supersingular elliptic curves. A 2011 paper by De Feo-Jao-Plût proposed a cryptographic system based on Supersingular Isogeny Diffie-Hellman as well as a set of five hard problems. In this paper we show that the security of the SIDH proposal relies on the hardness of the SIG path-finding problem introduced in [CGL06]. In addition, similarities between the number theoretic ingredients in the LPS and Pizer constructions suggest that the hardness of the path-finding problem in the two graphs may be linked. By viewing both graphs from a number theoretic perspective, we identify the similarities and differences between the Pizer and LPS graphs.
CROct 10, 2017
Attacks on the Search-RLWE problem with small errorsHao Chen, Kristin Lauter, Katherine E. Stange
The Ring Learning-With-Errors (RLWE) problem shows great promise for post-quantum cryptography and homomorphic encryption. We describe a new attack on the non-dual search RLWE problem with small error widths, using ring homomorphisms to finite fields and the chi-squared statistical test. In particular, we identify a "subfield vulnerability" (Section 5.2) and give a new attack which finds this vulnerability by mapping to a finite field extension and detecting non-uniformity with respect to the number of elements in the subfield. We use this attack to give examples of vulnerable RLWE instances in Galois number fields. We also extend the well-known search-to-decision reduction result to Galois fields with any unramified prime modulus q, regardless of the residue degree f of q, and we use this in our attacks. The time complexity of our attack is O(nq2f), where n is the degree of K and f is the residue degree of q in K. We also show an attack on the non-dual (resp. dual) RLWE problem with narrow error distributions in prime cyclotomic rings when the modulus is a ramified prime (resp. any integer). We demonstrate the attacks in practice by finding many vulnerable instances and successfully attacking them. We include the code for all attacks.
CROct 9, 2017
Security considerations for Galois non-dual RLWE familiesHao Chen, Kristin Lauter, Katherine E. Stange
We explore further the hardness of the non-dual discrete variant of the Ring-LWE problem for various number rings, give improved attacks for certain rings satisfying some additional assumptions, construct a new family of vulnerable Galois number fields, and apply some number theoretic results on Gauss sums to deduce the likely failure of these attacks for 2-power cyclotomic rings and unramified moduli.
ITDec 15, 2016
Variations of the McEliece CryptosystemJessalyn Bolkema, Heide Gluesing-Luerssen, Christine A. Kelley et al.
Two variations of the McEliece cryptosystem are presented. The first one is based on a relaxation of the column permutation in the classical McEliece scrambling process. This is done in such a way that the Hamming weight of the error, added in the encryption process, can be controlled so that efficient decryption remains possible. The second variation is based on the use of spatially coupled moderate-density parity-check codes as secret codes. These codes are known for their excellent error-correction performance and allow for a relatively low key size in the cryptosystem. For both variants the security with respect to known attacks is discussed.
LGDec 18, 2014
Crypto-Nets: Neural Networks over Encrypted DataPengtao Xie, Misha Bilenko, Tom Finley et al.
The problem we address is the following: how can a user employ a predictive model that is held by a third party, without compromising private information. For example, a hospital may wish to use a cloud service to predict the readmission risk of a patient. However, due to regulations, the patient's medical files cannot be revealed. The goal is to make an inference using the model, without jeopardizing the accuracy of the prediction or the privacy of the data. To achieve high accuracy, we use neural networks, which have been shown to outperform other learning models for many tasks. To achieve the privacy requirements, we use homomorphic encryption in the following protocol: the data owner encrypts the data and sends the ciphertexts to the third party to obtain a prediction from a trained model. The model operates on these ciphertexts and sends back the encrypted prediction. In this protocol, not only the data remains private, even the values predicted are available only to the data owner. Using homomorphic encryption and modifications to the activation functions and training algorithms of neural networks, we show that it is protocol is possible and may be feasible. This method paves the way to build a secure cloud-based neural network prediction services without invading users' privacy.