h-index72
31papers
1,372citations
Novelty56%
AI Score60

31 Papers

MLFeb 2, 2023
On the Efficacy of Differentially Private Few-shot Image Classification

Marlon Tobaben, Aliaksandra Shysheya, John Bronskill et al.

There has been significant recent progress in training differentially private (DP) models which achieve accuracy that approaches the best non-private models. These DP models are typically pretrained on large public datasets and then fine-tuned on private downstream datasets that are relatively large and similar in distribution to the pretraining data. However, in many applications including personalization and federated learning, it is crucial to perform well (i) in the few-shot setting, as obtaining large amounts of labeled data may be problematic; and (ii) on datasets from a wide variety of domains for use in various specialist settings. To understand under which conditions few-shot DP can be effective, we perform an exhaustive set of experiments that reveals how the accuracy and vulnerability to attack of few-shot DP image classification models are affected as the number of shots per class, privacy level, model architecture, downstream dataset, and subset of learnable parameters in the model vary. We show that to achieve DP accuracy on par with non-private models, the shots per class must be increased as the privacy level increases. We also show that learning parameter-efficient FiLM adapters under DP is competitive with learning just the final classifier layer or learning all of the network parameters. Finally, we evaluate DP federated learning systems and establish state-of-the-art performance on the challenging FLAIR benchmark.

LGDec 21, 2022
SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning

Ahmed Salem, Giovanni Cherubin, David Evans et al.

Deploying machine learning models in production may allow adversaries to infer sensitive information about training data. There is a vast literature analyzing different types of inference risks, ranging from membership inference to reconstruction attacks. Inspired by the success of games (i.e., probabilistic experiments) to study security properties in cryptography, some authors describe privacy inference risks in machine learning using a similar game-based style. However, adversary capabilities and goals are often stated in subtly different ways from one presentation to the other, which makes it hard to relate and compose results. In this paper, we present a game-based framework to systematize the body of knowledge on privacy inference risks in machine learning. We use this framework to (1) provide a unifying structure for definitions of inference risks, (2) formally establish known relations among definitions, and (3) to uncover hitherto unknown relations that would have been difficult to spot otherwise.

LGJun 10, 2022
Bayesian Estimation of Differential Privacy

Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople et al.

Algorithms such as Differentially Private SGD enable training machine learning models with formal privacy guarantees. However, there is a discrepancy between the protection that such algorithms guarantee in theory and the protection they afford in practice. An emerging strand of work empirically estimates the protection afforded by differentially private training as a confidence interval for the privacy budget $\varepsilon$ spent on training a model. Existing approaches derive confidence intervals for $\varepsilon$ from confidence intervals for the false positive and false negative rates of membership inference attacks. Unfortunately, obtaining narrow high-confidence intervals for $ε$ using this method requires an impractically large sample size and training as many models as samples. We propose a novel Bayesian method that greatly reduces sample size, and adapt and validate a heuristic to draw more than one sample per trained model. Our Bayesian method exploits the hypothesis testing interpretation of differential privacy to obtain a posterior for $\varepsilon$ (not just a confidence interval) from the joint posterior of the false positive and false negative rates of membership inference attacks. For the same sample size and confidence, we derive confidence intervals for $\varepsilon$ around 40% narrower than prior work. The heuristic, which we adapt from label-only DP, can be used to further reduce the number of trained models needed to get enough samples by up to 2 orders of magnitude.

LGNov 27, 2023
Rethinking Privacy in Machine Learning Pipelines from an Information Flow Control Perspective

Lukas Wutschitz, Boris Köpf, Andrew Paverd et al.

Modern machine learning systems use models trained on ever-growing corpora. Typically, metadata such as ownership, access control, or licensing information is ignored during training. Instead, to mitigate privacy risks, we rely on generic techniques such as dataset sanitization and differentially private model training, with inherent privacy/utility trade-offs that hurt model performance. Moreover, these techniques have limitations in scenarios where sensitive information is shared across multiple participants and fine-grained access control is required. By ignoring metadata, we therefore miss an opportunity to better address security, privacy, and confidentiality challenges. In this paper, we take an information flow control perspective to describe machine learning systems, which allows us to leverage metadata such as access control policies and define clear-cut privacy and confidentiality guarantees with interpretable information flows. Under this perspective, we contrast two different approaches to achieve user-level non-interference: 1) fine-tuning per-user models, and 2) retrieval augmented models that access user-specific datasets at inference time. We compare these two approaches to a trivially non-interfering zero-shot baseline using a public model and to a baseline that fine-tunes this model on the whole corpus. We evaluate trained models on two datasets of scientific articles and demonstrate that retrieval augmented architectures deliver the best utility, scalability, and flexibility while satisfying strict non-interference guarantees.

LGFeb 9Code
Stateless Yet Not Forgetful: Implicit Memory as a Hidden Channel in LLMs

Ahmed Salem, Andrew Paverd, Sahar Abdelnabi

Large language models (LLMs) are commonly treated as stateless: once an interaction ends, no information is assumed to persist unless it is explicitly stored and re-supplied. We challenge this assumption by introducing implicit memory-the ability of a model to carry state across otherwise independent interactions by encoding information in its own outputs and later recovering it when those outputs are reintroduced as input. This mechanism does not require any explicit memory module, yet it creates a persistent information channel across inference requests. As a concrete demonstration, we introduce a new class of temporal backdoors, which we call time bombs. Unlike conventional backdoors that activate on a single trigger input, time bombs activate only after a sequence of interactions satisfies hidden conditions accumulated via implicit memory. We show that such behavior can be induced today through straightforward prompting or fine-tuning. Beyond this case study, we analyze broader implications of implicit memory, including covert inter-agent communication, benchmark contamination, targeted manipulation, and training-data poisoning. Finally, we discuss detection challenges and outline directions for stress-testing and evaluation, with the goal of anticipating and controlling future developments. To promote future research, we release code and data at: https://github.com/microsoft/implicitMemory.

CRMay 29, 2025Code
Securing AI Agents with Information-Flow Control

Manuel Costa, Boris Köpf, Aashish Kolluri et al.

As AI agents become increasingly autonomous and capable, ensuring their security against vulnerabilities such as prompt injection becomes critical. This paper explores the use of information-flow control (IFC) to provide security guarantees for AI agents. We present a formal model to reason about the security and expressiveness of agent planners. Using this model, we characterize the class of properties enforceable by dynamic taint-tracking and construct a taxonomy of tasks to evaluate security and utility trade-offs of planner designs. Informed by this exploration, we present Fides, a planner that tracks confidentiality and integrity labels, deterministically enforces security policies, and introduces novel primitives for selectively hiding information. Its evaluation in AgentDojo demonstrates that this approach enables us to complete a broad range of tasks with security guarantees. A tutorial to walk readers through the the concepts introduced in the paper can be found at https://github.com/microsoft/fides

CRMay 14
MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs

Rui Wen, Mark Russinovich, Andrew Paverd et al.

Backdoor attacks pose a serious security threat to large language models (LLMs), which are increasingly deployed as general-purpose assistants in safety- and privacy-critical applications. Existing LLM backdoors rely primarily on content-based triggers, requiring explicit modification of the input text. In this work, we show that this assumption is unnecessary and limiting. We introduce MetaBackdoor, a new class of backdoor attacks that exploits positional information as the trigger, without modifying textual content. Our key insight is that Transformer-based LLMs necessarily encode token positions to process ordered sequences. As a result, length-correlated positional structure is reflected in the model's internal computation and can be used as an effective non-content trigger signal. We demonstrate that even a simple length-based positional trigger is sufficient to activate stealthy backdoors. Unlike prior attacks, MetaBackdoor operates on visibly and semantically clean inputs and enables qualitatively new capabilities. We show that a backdoored LLM can be induced to disclose sensitive internal information, including proprietary system prompts, once a length condition is satisfied. We further demonstrate a self-activation scenario, where normal multi-turn interaction can move the conversation context into the trigger region and induce malicious tool-call behavior without attacker-supplied trigger text. In addition, MetaBackdoor is orthogonal to content-based backdoors and can be composed with them to create more precise and harder-to-detect activation conditions. Our results expand the threat model of LLM backdoors by revealing positional encoding as a previously overlooked attack surface. This challenges defenses that focus on detecting suspicious text and highlights the need for new defense strategies that explicitly account for positional triggers in modern LLM architectures.

CRSep 25, 2019Code
PDoT: Private DNS-over-TLS with TEE Support

Yoshimichi Nakatsuka, Andrew Paverd, Gene Tsudik

Security and privacy of the Internet Domain Name System (DNS) have been longstanding concerns. Recently, there is a trend to protect DNS traffic using Transport Layer Security (TLS). However, at least two major issues remain: (1) how do clients authenticate DNS-over-TLS endpoints in a scalable and extensible manner; and (2) how can clients trust endpoints to behave as expected? In this paper, we propose a novel Private DNS-over-TLS (PDoT ) architecture. PDoT includes a DNS Recursive Resolver (RecRes) that operates within a Trusted Execution Environment (TEE). Using Remote Attestation, DNS clients can authenticate, and receive strong assurance of trustworthiness of PDoT RecRes. We provide an open-source proof-of-concept implementation of PDoT and use it to experimentally demonstrate that its latency and throughput match that of the popular Unbound DNS-over-TLS resolver.

CROct 14, 2018Code
S-FaaS: Trustworthy and Accountable Function-as-a-Service using Intel SGX

Fritz Alder, N. Asokan, Arseny Kurnikov et al.

Function-as-a-Service (FaaS) is a recent and already very popular paradigm in cloud computing. The function provider need only specify the function to be run, usually in a high-level language like JavaScript, and the service provider orchestrates all the necessary infrastructure and software stacks. The function provider is only billed for the actual computational resources used by the function invocation. Compared to previous cloud paradigms, FaaS requires significantly more fine-grained resource measurement mechanisms, e.g. to measure compute time and memory usage of a single function invocation with sub-second accuracy. Thanks to the short duration and stateless nature of functions, and the availability of multiple open-source frameworks, FaaS enables non-traditional service providers e.g. individuals or data centers with spare capacity. However, this exacerbates the challenge of ensuring that resource consumption is measured accurately and reported reliably. It also raises the issues of ensuring computation is done correctly and minimizing the amount of information leaked to service providers. To address these challenges, we introduce S-FaaS, the first architecture and implementation of FaaS to provide strong security and accountability guarantees backed by Intel SGX. To match the dynamic event-driven nature of FaaS, our design introduces a new key distribution enclave and a novel transitive attestation protocol. A core contribution of S-FaaS is our set of resource measurement mechanisms that securely measure compute time inside an enclave, and actual memory allocations. We have integrated S-FaaS into the popular OpenWhisk FaaS framework. We evaluate the security of our architecture, the accuracy of our resource measurement mechanisms, and the performance of our implementation, showing that our resource measurement mechanisms add less than 6.3% latency on standardized benchmarks.

CRSep 5, 2017Code
SafeKeeper: Protecting Web Passwords using Trusted Execution Environments

Klaudia Krawiecka, Arseny Kurnikov, Andrew Paverd et al.

Passwords are undoubtedly the most dominant user authentication mechanism on the web today. Although they are inexpensive and easy-to-use, security concerns of password-based authentication are serious. Phishing and theft of password databases are two critical concerns. The tendency of users to re-use passwords across different services exacerbates the impact of these two concerns. Current solutions addressing these concerns are not fully satisfactory: they typically address only one of the two concerns; they do not protect passwords from rogue servers; they do not provide any verifiable evidence of their (server-side) adoption to users; and they face deployability challenges in terms of the cost for service providers and/or ease-of-use for end users. We present SafeKeeper, a comprehensive approach to protect the confidentiality of passwords in web authentication systems. Unlike previous approaches, SafeKeeper protects user passwords against very strong adversaries, including rogue servers and sophisticated external phishers. It is relatively inexpensive to deploy as it (i) uses widely available hardware security mechanisms like Intel SGX, (ii) is integrated into popular web platforms like WordPress, and (iii) has small performance overhead. We describe a variety of challenges in designing and implementing such a system, and how we overcome them. Through an 86-participant user study, and systematic analysis and experiments, we demonstrate the usability, security and deployability of SafeKeeper, which is available as open-source.

CRNov 6, 2015Code
OmniShare: Securely Accessing Encrypted Cloud Storage from Multiple Authorized Devices

Andrew Paverd, Sandeep Tamrakar, Hoang Long Nguyen et al.

Cloud storage services like Dropbox and Google Drive are widely used by individuals and businesses. Two attractive features of these services are 1) the automatic synchronization of files between multiple client devices and 2) the possibility to share files with other users. However, privacy of cloud data is a growing concern for both individuals and businesses. Encrypting data on the client-side before uploading it is an effective privacy safeguard, but it requires all client devices to have the decryption key. Current solutions derive these keys solely from user-chosen passwords, which have low entropy and are easily guessed. We present OmniShare, the first scheme to allow client-side encryption with high-entropy keys whilst providing an intuitive key distribution mechanism to enable access from multiple client devices. Instead of passwords, we use low bandwidth uni-directional out-of-band (OOB) channels, such as QR codes, to authenticate new devices. To complement these OOB channels, the cloud storage itself is used as a communication channel between devices in our protocols. We rely on a directory-based key hierarchy with individual file keys to limit the consequences of key compromise and allow efficient sharing of files without requiring re-encryption. OmniShare is open source software and currently available for Android and Windows with other platforms in development. We describe the design and implementation of OmniShare, and explain how we evaluated its security using formal methods, its performance via real-world benchmarks, and its usability through a cognitive walkthrough.

CRDec 12, 2023
Maatphor: Automated Variant Analysis for Prompt Injection Attacks

Ahmed Salem, Andrew Paverd, Boris Köpf

Prompt injection has emerged as a serious security threat to large language models (LLMs). At present, the current best-practice for defending against newly-discovered prompt injection techniques is to add additional guardrails to the system (e.g., by updating the system prompt or using classifiers on the input and/or output of the model.) However, in the same way that variants of a piece of malware are created to evade anti-virus software, variants of a prompt injection can be created to evade the LLM's guardrails. Ideally, when a new prompt injection technique is discovered, candidate defenses should be tested not only against the successful prompt injection, but also against possible variants. In this work, we present, a tool to assist defenders in performing automated variant analysis of known prompt injection attacks. This involves solving two main challenges: (1) automatically generating variants of a given prompt according, and (2) automatically determining whether a variant was effective based only on the output of the model. This tool can also assist in generating datasets for jailbreak and prompt injection attacks, thus overcoming the scarcity of data in this domain. We evaluate Maatphor on three different types of prompt injection tasks. Starting from an ineffective (0%) seed prompt, Maatphor consistently generates variants that are at least 60% effective within the first 40 iterations.

LGJun 10, 2025
Design Patterns for Securing LLM Agents against Prompt Injections

Luca Beurer-Kellner, Beat Buesser, Ana-Maria Creţu et al. · eth-zurich

As AI agents powered by Large Language Models (LLMs) become increasingly versatile and capable of addressing a broad spectrum of tasks, ensuring their security has become a critical challenge. Among the most pressing threats are prompt injection attacks, which exploit the agent's resilience on natural language inputs -- an especially dangerous threat when agents are granted tool access or handle sensitive information. In this work, we propose a set of principled design patterns for building AI agents with provable resistance to prompt injection. We systematically analyze these patterns, discuss their trade-offs in terms of utility and security, and illustrate their real-world applicability through a series of case studies.

CRJun 11, 2025
LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge

Sahar Abdelnabi, Aideen Fay, Ahmed Salem et al.

Indirect Prompt Injection attacks exploit the inherent limitation of Large Language Models (LLMs) to distinguish between instructions and data in their inputs. Despite numerous defense proposals, the systematic evaluation against adaptive adversaries remains limited, even when successful attacks can have wide security and privacy implications, and many real-world LLM-based applications remain vulnerable. We present the results of LLMail-Inject, a public challenge simulating a realistic scenario in which participants adaptively attempted to inject malicious instructions into emails in order to trigger unauthorized tool calls in an LLM-based email assistant. The challenge spanned multiple defense strategies, LLM architectures, and retrieval configurations, resulting in a dataset of 208,095 unique attack submissions from 839 participants. We release the challenge code, the full dataset of submissions, and our analysis demonstrating how this data can provide new insights into the instruction-data separation problem. We hope this will serve as a foundation for future research towards practical structural solutions to prompt injection.

CRFeb 22, 2024
Closed-Form Bounds for DP-SGD against Record-level Inference

Giovanni Cherubin, Boris Köpf, Andrew Paverd et al.

Machine learning models trained with differentially-private (DP) algorithms such as DP-SGD enjoy resilience against a wide range of privacy attacks. Although it is possible to derive bounds for some attacks based solely on an $(\varepsilon,δ)$-DP guarantee, meaningful bounds require a small enough privacy budget (i.e., injecting a large amount of noise), which results in a large loss in utility. This paper presents a new approach to evaluate the privacy of machine learning models against specific record-level threats, such as membership and attribute inference, without the indirection through DP. We focus on the popular DP-SGD algorithm, and derive simple closed-form bounds. Our proofs model DP-SGD as an information theoretic channel whose inputs are the secrets that an attacker wants to infer (e.g., membership of a data record) and whose outputs are the intermediate model parameters produced by iterative optimization. We obtain bounds for membership inference that match state-of-the-art techniques, whilst being orders of magnitude faster to compute. Additionally, we present a novel data-dependent bound against attribute inference. Our results provide a direct, interpretable, and practical way to evaluate the privacy of trained models against specific inference threats without sacrificing utility.

CLAug 4, 2025
Highlight & Summarize: RAG without the jailbreaks

Giovanni Cherubin, Andrew Paverd

Preventing jailbreaking and model hijacking of Large Language Models (LLMs) is an important yet challenging task. For example, when interacting with a chatbot, malicious users can input specially crafted prompts to cause the LLM to generate undesirable content or perform a completely different task from its intended purpose. Existing mitigations for such attacks typically rely on hardening the LLM's system prompt or using a content classifier trained to detect undesirable content or off-topic conversations. However, these probabilistic approaches are relatively easy to bypass due to the very large space of possible inputs and undesirable outputs. In this paper, we present and evaluate Highlight & Summarize (H&S), a new design pattern for retrieval-augmented generation (RAG) systems that prevents these attacks by design. The core idea is to perform the same task as a standard RAG pipeline (i.e., to provide natural language answers to questions, based on relevant sources) without ever revealing the user's question to the generative LLM. This is achieved by splitting the pipeline into two components: a highlighter, which takes the user's question and extracts relevant passages ("highlights") from the retrieved documents, and a summarizer, which takes the highlighted passages and summarizes them into a cohesive answer. We describe several possible instantiations of H&S and evaluate their generated responses in terms of correctness, relevance, and response quality. Surprisingly, when using an LLM-based highlighter, the majority of H&S responses are judged to be better than those of a standard RAG pipeline.

CRDec 13, 2024
VerifiableFL: Verifiable Claims for Federated Learning using Exclaves

Jinnan Guo, Kapil Vaswani, Andrew Paverd et al.

In federated learning (FL), data providers jointly train a machine learning model without sharing their training data. This makes it challenging to provide verifiable claims about properties of the final trained FL model, e.g., related to the employed training data, the used data sanitization, or the correct training algorithm -- a malicious data provider can simply deviate from the correct training protocol without being detected. While prior FL training systems have explored the use of trusted execution environments (TEEs) to combat such attacks, existing approaches struggle to link attestation proofs from TEEs robustly and effectively with claims about the trained FL model. TEEs have also been shown to suffer from a wide range of attacks, including side-channel attacks. We describe VerifiableFL, a system for training FL models that provides verifiable claims about trained models with the help of runtime attestation proofs. VerifiableFL generates such proofs using the new abstraction of exclaves, which are integrity-only execution environments without any secrets, thus making them immune to data leakage attacks. Whereas previous approaches only attested whole TEEs statically, i.e., at deployment time, VerifiableFL uses exclaves to attest individual data transformations during FL training. These runtime attestation proofs then form an attested dataflow graph of the entire FL model training computation. The graph can be checked by an auditor to ensure that the trained FL model satisfies its verifiable claims, such as the use of particular data sanitization by data providers or aggregation strategy by the model provider. We implement VerifiableFL by extending NVIDIA's NVFlare FL framework to use exclaves, and show that VerifiableFL introduces less than 10% overhead compared to unprotected FL model training.

CRJun 2, 2024
Get my drift? Catching LLM Task Drift with Activation Deltas

Sahar Abdelnabi, Aideen Fay, Giovanni Cherubin et al.

LLMs are commonly used in retrieval-augmented applications to execute user instructions based on data from external sources. For example, modern search engines use LLMs to answer queries based on relevant search results; email plugins summarize emails by processing their content through an LLM. However, the potentially untrusted provenance of these data sources can lead to prompt injection attacks, where the LLM is manipulated by natural language instructions embedded in the external data, causing it to deviate from the user's original instruction(s). We define this deviation as task drift. Task drift is a significant concern as it allows attackers to exfiltrate data or influence the LLM's output for other users. We study LLM activations as a solution to detect task drift, showing that activation deltas - the difference in activations before and after processing external data - are strongly correlated with this phenomenon. Through two probing methods, we demonstrate that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set. We evaluate these methods by making minimal assumptions about how users' tasks, system prompts, and attacks can be phrased. We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions, without being trained on any of these attacks. Interestingly, the fact that this solution does not require any modifications to the LLM (e.g., fine-tuning), as well as its compatibility with existing meta-prompting solutions, makes it cost-efficient and easy to deploy. To encourage further research on activation-based task inspection, decoding, and interpretability, we release our large-scale TaskTracker toolkit, featuring a dataset of over 500K instances, representations from six SoTA language models, and a suite of inspection tools.

CRMay 14, 2021
VICEROY: GDPR-/CCPA-compliant Enforcement of Verifiable Accountless Consumer Requests

Scott Jordan, Yoshimichi Nakatsuka, Ercan Ozturk et al.

Recent data protection regulations (such as GDPR and CCPA) grant consumers various rights, including the right to access, modify or delete any personal information collected about them (and retained) by a service provider. To exercise these rights, one must submit a verifiable consumer request proving that the collected data indeed pertains to them. This action is straightforward for consumers with active accounts with a service provider at the time of data collection, since they can use standard (e.g., password-based) means of authentication to validate their requests. However, a major conundrum arises from the need to support consumers without accounts to exercise their rights. To this end, some service providers began requiring such accountless consumers to reveal and prove their identities (e.g., using government-issued documents, utility bills, or credit card numbers) as part of issuing a verifiable consumer request. While understandable as a short-term cure, this approach is cumbersome and expensive for service providers as well as privacy-invasive for consumers. Consequently, there is a strong need to provide better means of authenticating requests from accountless consumers. To achieve this, we propose VICEROY, a privacy-preserving and scalable framework for producing proofs of data ownership, which form a basis for verifiable consumer requests. Building upon existing web techniques and features, VICEROY allows accountless consumers to interact with service providers, and later prove that they are the same person in a privacy-preserving manner, while requiring minimal changes for both parties. We design and implement VICEROY with emphasis on security/privacy, deployability and usability. We also thoroughly assess its practicality via extensive experiments.

CRJul 20, 2020
CACTI: Captcha Avoidance via Client-side TEE Integration

Yoshimichi Nakatsuka, Ercan Ozturk, Andrew Paverd et al.

Preventing abuse of web services by bots is an increasingly important problem, as abusive activities grow in both volume and variety. CAPTCHAs are the most common way for thwarting bot activities. However, they are often ineffective against bots and frustrating for humans. In addition, some recent CAPTCHA techniques diminish user privacy. Meanwhile, client-side Trusted Execution Environments (TEEs) are becoming increasingly widespread (notably, ARM TrustZone and Intel SGX), allowing establishment of trust in a small part (trust anchor or TCB) of client-side hardware. This prompts the question: can a TEE help reduce (or remove entirely) user burden of solving CAPTCHAs? In this paper, we design CACTI: CAPTCHA Avoidance via Client-side TEE Integration. Using client-side TEEs, CACTI allows legitimate clients to generate unforgeable rate-proofs demonstrating how frequently they have performed specific actions. These rate-proofs can be sent to web servers in lieu of solving CAPTCHAs. CACTI provides strong client privacy guarantees, since the information is only sent to the visited website and authenticated using a group signature scheme. Our evaluations show that overall latency of generating and verifying a CACTI rate-proof is less than 0.25 sec, while CACTI's bandwidth overhead is over 98% lower than that of current CAPTCHA systems.

LGDec 17, 2019
Analyzing Information Leakage of Updates to Natural Language Models

Santiago Zanella-Béguelin, Lukas Wutschitz, Shruti Tople et al.

To continuously improve quality and reflect changes in data, machine learning applications have to regularly retrain and update their core models. We show that a differential analysis of language model snapshots before and after an update can reveal a surprising amount of detailed information about changes in the training data. We propose two new metrics---\emph{differential score} and \emph{differential rank}---for analyzing the leakage due to updates of natural language models. We perform leakage analysis using these metrics across models trained on several different datasets using different methods and configurations. We discuss the privacy implications of our findings, propose mitigation strategies and evaluate their effect.

CRAug 20, 2018
Mitigating Branch-Shadowing Attacks on Intel SGX using Control Flow Randomization

Shohreh Hosseinzadeh, Hans Liljestrand, Ville Leppänen et al.

Intel Software Guard Extensions (SGX) is a promising hardware-based technology for protecting sensitive computations from potentially compromised system software. However, recent research has shown that SGX is vulnerable to branch-shadowing -- a side channel attack that leaks the fine-grained (branch granularity) control flow of an enclave (SGX protected code), potentially revealing sensitive data to the attacker. The previously-proposed defense mechanism, called Zigzagger, attempted to hide the control flow, but has been shown to be ineffective if the attacker can single-step through the enclave using the recent SGX-Step framework. Taking into account these stronger attacker capabilities, we propose a new defense against branch-shadowing, based on control flow randomization. Our scheme is inspired by Zigzagger, but provides quantifiable security guarantees with respect to a tunable security parameter. Specifically, we eliminate conditional branches and hide the targets of unconditional branches using a combination of compile-time modifications and run-time code randomization. We evaluated the performance of our approach by measuring the run-time overhead of ten benchmark programs of SGX-Nbench in SGX environment.

CRApr 23, 2018
Keys in the Clouds: Auditable Multi-device Access to Cryptographic Credentials

Arseny Kurnikov, Andrew Paverd, Mohammad Mannan et al.

Personal cryptographic keys are the foundation of many secure services, but storing these keys securely is a challenge, especially if they are used from multiple devices. Storing keys in a centralized location, like an Internet-accessible server, raises serious security concerns (e.g. server compromise). Hardware-based Trusted Execution Environments (TEEs) are a well-known solution for protecting sensitive data in untrusted environments, and are now becoming available on commodity server platforms. Although the idea of protecting keys using a server-side TEE is straight-forward, in this paper we validate this approach and show that it enables new desirable functionality. We describe the design, implementation, and evaluation of a TEE-based Cloud Key Store (CKS), an online service for securely generating, storing, and using personal cryptographic keys. Using remote attestation, users receive strong assurance about the behaviour of the CKS, and can authenticate themselves using passwords while avoiding typical risks of password-based authentication like password theft or phishing. In addition, this design allows users to i) define policy-based access controls for keys; ii) delegate keys to other CKS users for a specified time and/or a limited number of uses; and iii) audit all key usages via a secure audit log. We have implemented a proof of concept CKS using Intel SGX and integrated this into GnuPG on Linux and OpenKeychain on Android. Our CKS implementation performs approximately 6,000 signature operations per second on a single desktop PC. The latency is in the same order of magnitude as using locally-stored keys, and 20x faster than smart cards.

CRMar 29, 2018
Migrating SGX Enclaves with Persistent State

Fritz Alder, Arseny Kurnikov, Andrew Paverd et al.

Hardware-supported security mechanisms like Intel Software Guard Extensions (SGX) provide strong security guarantees, which are particularly relevant in cloud settings. However, their reliance on physical hardware conflicts with cloud practices, like migration of VMs between physical platforms. For instance, the SGX trusted execution environment (enclave) is bound to a single physical CPU. Although prior work has proposed an effective mechanism to migrate an enclave's data memory, it overlooks the migration of persistent state, including sealed data and monotonic counters; the former risks data loss whilst the latter undermines the SGX security guarantees. We show how this can be exploited to mount attacks, and then propose an improved enclave migration approach guaranteeing the consistency of persistent state. Our software-only approach enables migratable sealed data and monotonic counters, maintains all SGX security guarantees, minimizes developer effort, and incurs negligible performance overhead.

CROct 17, 2017
Towards Linux Kernel Memory Safety

Elena Reshetova, Hans Liljestrand, Andrew Paverd et al.

The security of billions of devices worldwide depends on the security and robustness of the mainline Linux kernel. However, the increasing number of kernel-specific vulnerabilities, especially memory safety vulnerabilities, shows that the kernel is a popular and practically exploitable target. Two major causes of memory safety vulnerabilities are reference counter overflows (temporal memory errors) and lack of pointer bounds checking (spatial memory errors). To succeed in practice, security mechanisms for critical systems like the Linux kernel must also consider performance and deployability as critical design objectives. We present and systematically analyze two such mechanisms for improving memory safety in the Linux kernel: (a) an overflow-resistant reference counter data structure designed to accommodate typical reference counter usage in kernel source code, and (b) runtime pointer bounds checking using Intel MPX in the kernel.

CRJun 12, 2017
LO-FAT: Low-Overhead Control Flow ATtestation in Hardware

Ghada Dessouky, Shaza Zeitouni, Thomas Nyman et al.

Attacks targeting software on embedded systems are becoming increasingly prevalent. Remote attestation is a mechanism that allows establishing trust in embedded devices. However, existing attestation schemes are either static and cannot detect control-flow attacks, or require instrumentation of software incurring high performance overheads. To overcome these limitations, we present LO-FAT, the first practical hardware-based approach to control-flow attestation. By leveraging existing processor hardware features and commonly-used IP blocks, our approach enables efficient control-flow attestation without requiring software instrumentation. We show that our proof-of-concept implementation based on a RISC-V SoC incurs no processor stalls and requires reasonable area overhead.

CRMay 29, 2017
HardScope: Thwarting DOP with Hardware-assisted Run-time Scope Enforcement

Thomas Nyman, Ghada Dessouky, Shaza Zeitouni et al.

Widespread use of memory unsafe programming languages (e.g., C and C++) leaves many systems vulnerable to memory corruption attacks. A variety of defenses have been proposed to mitigate attacks that exploit memory errors to hijack the control flow of the code at run-time, e.g., (fine-grained) randomization or Control Flow Integrity. However, recent work on data-oriented programming (DOP) demonstrated highly expressive (Turing-complete) attacks, even in the presence of these state-of-the-art defenses. Although multiple real-world DOP attacks have been demonstrated, no efficient defenses are yet available. We propose run-time scope enforcement (RSE), a novel approach designed to efficiently mitigate all currently known DOP attacks by enforcing compile-time memory safety constraints (e.g., variable visibility rules) at run-time. We present HardScope, a proof-of-concept implementation of hardware-assisted RSE for the new RISC-V open instruction set architecture. We discuss our systematic empirical evaluation of HardScope which demonstrates that it can mitigate all currently known DOP attacks, and has a real-world performance overhead of 3.2% in embedded benchmarks.

CRApr 24, 2017
Formal Analysis of V2X Revocation Protocols

Jorden Whitefield, Liqun Chen, Frank Kargl et al.

Research on vehicular networking (V2X) security has produced a range of security mechanisms and protocols tailored for this domain, addressing both security and privacy. Typically, the security analysis of these proposals has largely been informal. However, formal analysis can be used to expose flaws and ultimately provide a higher level of assurance in the protocols. This paper focusses on the formal analysis of a particular element of security mechanisms for V2X found in many proposals: the revocation of malicious or misbehaving vehicles from the V2X system by invalidating their credentials. This revocation needs to be performed in an unlinkable way for vehicle privacy even in the context of vehicles regularly changing their pseudonyms. The REWIRE scheme by Forster et al. and its subschemes BASIC and RTOKEN aim to solve this challenge by means of cryptographic solutions and trusted hardware. Formal analysis using the TAMARIN prover identifies two flaws with some of the functional correctness and authentication properties in these schemes. We then propose Obscure Token (OTOKEN), an extension of REWIRE to enable revocation in a privacy preserving manner. Our approach addresses the functional and authentication properties by introducing an additional key-pair, which offers a stronger and verifiable guarantee of successful revocation of vehicles without resolving the long-term identity. Moreover OTOKEN is the first V2X revocation protocol to be co-designed with a formal model.

CRMar 10, 2017
Security in Automotive Networks: Lightweight Authentication and Authorization

Philipp Mundhenk, Andrew Paverd, Artur Mrowca et al.

With the increasing amount of interconnections between vehicles, the attack surface of internal vehicle networks is rising steeply. Although these networks are shielded against external attacks, they often do not have any internal security to protect against malicious components or adversaries who breach the network perimeter. To secure the in-vehicle network, all communicating components must be authenticated, and only authorized components should be allowed to send and receive messages. This is achieved using an authentication framework. Cryptography is widely used to authenticate communicating parties and provide secure communication channels (e.g., Internet communication). However, the real-time performance requirements of in-vehicle networks restrict the types of cryptographic algorithms and protocols that may be used. In particular, asymmetric cryptography is computationally infeasible during vehicle operation. In this work, we address the challenges of designing authentication protocols for automotive systems. We present Lightweight Authentication for Secure Automotive Networks (LASAN), a full lifecycle authentication approach. We describe the core LASAN protocols and show how they protect the internal vehicle network while complying with the real-time constraints and low computational resources of this domain. Unlike previous work, we also explain how this framework can be integrated into all aspects of the automotive lifecycle, including manufacturing, vehicle maintenance, and software updates. We evaluate LASAN in two different ways: First, we analyze the security properties of the protocols using established protocol verification techniques based on formal methods. Second, we evaluate the timing requirements of LASAN and compare these to other frameworks using a new highly modular discrete event simulator for in-vehicle networks, which we have developed for this evaluation.

CRJun 6, 2016
The Circle Game: Scalable Private Membership Test Using Trusted Hardware

Sandeep Tamrakar, Jian Liu, Andrew Paverd et al.

Malware checking is changing from being a local service to a cloud-assisted one where users' devices query a cloud server, which hosts a dictionary of malware signatures, to check if particular applications are potentially malware. Whilst such an architecture gains all the benefits of cloud-based services, it opens up a major privacy concern since the cloud service can infer personal traits of the users based on the lists of applications queried by their devices. Private membership test (PMT) schemes can remove this privacy concern. However, known PMT schemes do not scale well to a large number of simultaneous users and high query arrival rates. We propose a simple PMT approach using a carousel: circling the entire dictionary through trusted hardware on the cloud server. Users communicate with the trusted hardware via secure channels. We show how the carousel approach, using different data structures to represent the dictionary, can be realized on two different commercial hardware security architectures (ARM TrustZone and Intel SGX). We highlight subtle aspects of securely implementing seemingly simple PMT schemes on these architectures. Through extensive experimental analysis, we show that for the malware checking scenario our carousel approach surprisingly outperforms Path ORAM on the same hardware by supporting a much higher query arrival rate while guaranteeing acceptable response latency for individual queries.

CRMay 25, 2016
C-FLAT: Control-FLow ATtestation for Embedded Systems Software

Tigist Abera, N. Asokan, Lucas Davi et al.

Remote attestation is a crucial security service particularly relevant to increasingly popular IoT (and other embedded) devices. It allows a trusted party (verifier) to learn the state of a remote, and potentially malware-infected, device (prover). Most existing approaches are static in nature and only check whether benign software is initially loaded on the prover. However, they are vulnerable to run-time attacks that hijack the application's control or data flow, e.g., via return-oriented programming or data-oriented exploits. As a concrete step towards more comprehensive run-time remote attestation, we present the design and implementation of Control- FLow ATtestation (C-FLAT) that enables remote attestation of an application's control-flow path, without requiring the source code. We describe a full prototype implementation of C-FLAT on Raspberry Pi using its ARM TrustZone hardware security extensions. We evaluate C-FLAT's performance using a real-world embedded (cyber-physical) application, and demonstrate its efficacy against control-flow hijacking attacks.