LGJun 29, 2023
Group-based Robustness: A General Framework for Customized Robustness in the Real WorldWeiran Lin, Keane Lucas, Neo Eyal et al. · cmu
Machine-learning models are known to be vulnerable to evasion attacks that perturb model inputs to induce misclassifications. In this work, we identify real-world scenarios where the true threat cannot be assessed accurately by existing attacks. Specifically, we find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model's ability to withstand attacks from one set of source classes to another set of target classes. To address the shortcomings of existing methods, we formally define a new metric, termed group-based robustness, that complements existing metrics and is better-suited for evaluating model performance in certain attack scenarios. We show empirically that group-based robustness allows us to distinguish between models' vulnerability against specific threat models in situations where traditional robustness metrics do not apply. Moreover, to measure group-based robustness efficiently and accurately, we 1) propose two loss functions and 2) identify three new attack strategies. We show empirically that with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes, and finding evasive samples using our new attack strategies saves time by up to 99\% compared to brute-force search methods. Finally, we propose a defense method that increases group-based robustness by up to 3.52$\times$.
CRJul 21, 2024
A General Framework for Data-Use Auditing of ML ModelsZonghao Huang, Neil Zhenqiang Gong, Michael K. Reiter
Auditing the use of data in training machine-learning (ML) models is an increasingly pressing challenge, as myriad ML practitioners routinely leverage the effort of content creators to train models without their permission. In this paper, we propose a general method to audit an ML model for the use of a data-owner's data in training, without prior knowledge of the ML task for which the data might be used. Our method leverages any existing black-box membership inference method, together with a sequential hypothesis test of our own design, to detect data use with a quantifiable, tunable false-detection rate. We show the effectiveness of our proposed framework by applying it to audit data use in two types of ML models, namely image classifiers and foundation models.
45.0CRApr 11
Automatic Teller Machines for Offline E-cashAnrin Chakraborti, Qingzhao Zhang, Jingjia Peng et al.
Electronic cash (e-cash) is a digital alternative to physical currency that allows anonymous transactions between users and merchants. Typically, coins in an e-cash scheme are only dispensed through a central bank. A drawback of this approach is that the bank is always on the critical path during withdrawals, and if a reliable connection to the bank is temporarily unavailable, users may be unable to withdraw coins in a timely fashion. As with physical currency, there are benefits to supporting a decentralized infrastructure where withdrawals can be performed without involving the bank in the critical path. We propose the design of a new cryptographic bearer token that can be dispensed by automatic teller machines (ATM) in a fully offline e-cash scheme. Such bearer tokens provide anonymity, unforgeability and untraceability, i.e., users cannot be tracked by their spending activities or the locations of withdrawal. We formalize the requirements of an e-cash scheme with multiple issuers and propose an efficient design building on top of the compact e-cash protocol of Camenisch et al. (EUROCRYPT 2005). Our construction leverages an unforgeable and doubly-anonymous voucher that allows a one-time transfer of coins between an ATM and a user, while hiding their identities from parties not involved in the transaction.
41.5CRApr 17
Half-Moon Cookie: Private, Similarity-Based Blocklisting with TOCTOU-Attack ResilienceXinyuan Zhang, Anrin Chakraborti, Michael K. Reiter
Blocklisting is a common technique for preventing the use of known malicious content. However, conventional blocklisting infrastructures require either the blocklist to be public or clients to reveal their queries to the blocklist server. In this work, we introduce a private blocklisting framework, Half-Moon Cookie, by which a client can check an item against a proprietary blocklist held by a server, to determine whether the item is close to any blocklist element in a metric space. Critically, our design separates the embedding step from the blocklist check, so that performance degrades with their sum and not their product. Still, this check might be too costly to perform on the critical path of using the item, and so our design also supports a very efficient check that an item previously passed the blocklist check. In doing so, we support applications where one client can perform the blocklist check on the item before sending it, and recipients can more efficiently confirm the previous result before using the item, thereby avoiding TOCTOU attacks. We demonstrate how Half-Moon Cookie can be instantiated for similarity-based malware detection, enabling effective identification of malicious executables without revealing client inputs or disclosing the underlying blocklist.
19.1CRApr 23
Incentivizing Collaboration for Detection of Credential Database BreachesMridu Nanda, Michael K. Reiter
Decoy passwords, or ``honeywords,'' alert a site to its breach if entered in a login attempt on that site. However, an attacker can identify a user-chosen password from among the decoys, without alerting the site to its breach, via credential stuffing, i.e., entering the stolen passwords at another site where a user reused her password. Prior work thus proposed that sites monitor for the entry of their honeywords at other sites, but the incentives for sites to participate in this monitoring remain unclear. In this paper, we propose and evaluate an algorithm by which sites can exchange monitoring favors. Through a model-checking analysis, we show that a site can improve its ability to detect its own breach when it increases the monitoring effort it expends for others. We quantify how key parameters impact detection effectiveness and their implications for deploying a monitoring ecosystem. Finally, we evaluate our algorithm on a breached credential dataset, demonstrating effectiveness at scale.
62.3DCMar 25
Rafture: Erasure-coded Raft with Post-Dissemination PruningRithwik Kerur, Divyakant Agrawal, Michael K. Reiter et al.
Spreading and storing erasure-coded data in distributed systems effectively is challenging in real settings. Practical deployments must contend with unpredictable network latencies, particularly when information dispersal is integrated into consensus protocols, a prominent and latency-sensitive use case. Existing approaches address this challenge through timeout-based dissemination and adaptive communication or storage decisions driven by acknowledgments during dissemination. However, these designs focus almost exclusively on dissemination-time efficiency, complicate recovery with reconstruction procedures that require metadata that can differ per consensus value, and rely on a centralized leader to make storage decisions for all nodes. This paper introduces \textbf{Rafture}, a novel information dispersal algorithm, and its integration in a consensus protocol, that overcomes these limitations. Rafture is the first information dispersal solution to incorporate post-dissemination pruning, allowing systems to adapt storage costs after dissemination completes. It employs a simple, fixed-threshold erasure code while varying distinct fragment assignment along a second dimension. This ensures that reconstruction is always possible from $F+1$ fragments using the same interpolation method and no additional metadata. Rafture further enables nodes to adapt autonomously based on locally observed information, eliminating the need for global coordination. We evaluate Rafture in highly dynamic network settings and show that it simplifies recovery while significantly improving long-term storage consumption under variable network conditions.
CRFeb 22, 2024
Mudjacking: Patching Backdoor Vulnerabilities in Foundation ModelsHongbin Liu, Michael K. Reiter, Neil Zhenqiang Gong
Foundation model has become the backbone of the AI ecosystem. In particular, a foundation model can be used as a general-purpose feature extractor to build various downstream classifiers. However, foundation models are vulnerable to backdoor attacks and a backdoored foundation model is a single-point-of-failure of the AI ecosystem, e.g., multiple downstream classifiers inherit the backdoor vulnerabilities simultaneously. In this work, we propose Mudjacking, the first method to patch foundation models to remove backdoors. Specifically, given a misclassified trigger-embedded input detected after a backdoored foundation model is deployed, Mudjacking adjusts the parameters of the foundation model to remove the backdoor. We formulate patching a foundation model as an optimization problem and propose a gradient descent based method to solve it. We evaluate Mudjacking on both vision and language foundation models, eleven benchmark datasets, five existing backdoor attacks, and thirteen adaptive backdoor attacks. Our results show that Mudjacking can remove backdoor from a foundation model while maintaining its utility.
CRMar 28, 2025
Instance-Level Data-Use Auditing of Visual ML ModelsZonghao Huang, Neil Zhenqiang Gong, Michael K. Reiter
The growing trend of legal disputes over the unauthorized use of data in machine learning (ML) systems highlights the urgent need for reliable data-use auditing mechanisms to ensure accountability and transparency in ML. We present the first proactive, instance-level, data-use auditing method designed to enable data owners to audit the use of their individual data instances in ML models, providing more fine-grained auditing results than previous work. To do so, our research generalizes previous work integrating black-box membership inference and sequential hypothesis testing, expanding its scope of application while preserving the quantifiable and tunable false-detection rate that is its hallmark. We evaluate our method on three types of visual ML models: image classifiers, visual encoders, and vision-language models (Contrastive Language-Image Pretraining (CLIP) and Bootstrapping Language-Image Pretraining (BLIP) models). In addition, we apply our method to evaluate the performance of two state-of-the-art approximate unlearning methods. As a noteworthy second contribution, our work reveals that neither method successfully removes the influence of the unlearned data instances from image classifiers and CLIP models, even if sacrificing model utility by $10\%$.
CRMay 10, 2024
Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data PoisoningYujie Zhang, Neil Gong, Michael K. Reiter
Federated Learning (FL) is a decentralized machine learning method that enables participants to collaboratively train a model without sharing their private data. Despite its privacy and scalability benefits, FL is susceptible to backdoor attacks, where adversaries poison the local training data of a subset of clients using a backdoor trigger, aiming to make the aggregated model produce malicious results when the same backdoor condition is met by an inference-time input. Existing backdoor attacks in FL suffer from common deficiencies: fixed trigger patterns and reliance on the assistance of model poisoning. State-of-the-art defenses based on analyzing clients' model updates exhibit a good defense performance on these attacks because of the significant divergence between malicious and benign client model updates. To effectively conceal malicious model updates among benign ones, we propose DPOT, a backdoor attack strategy in FL that dynamically constructs backdoor objectives by optimizing a backdoor trigger, making backdoor data have minimal effect on model updates. We provide theoretical justifications for DPOT's attacking principle and display experimental results showing that DPOT, via only a data-poisoning attack, effectively undermines state-of-the-art defenses and outperforms existing backdoor attack techniques on various datasets.
CRDec 3, 2023
Mendata: A Framework to Purify Manipulated Training DataZonghao Huang, Neil Gong, Michael K. Reiter
Untrusted data used to train a model might have been manipulated to endow the learned model with hidden properties that the data contributor might later exploit. Data purification aims to remove such manipulations prior to training the model. We propose Mendata, a novel framework to purify manipulated training data. Starting from a small reference dataset in which a large majority of the inputs are clean, Mendata perturbs the training inputs so that they retain their utility but are distributed similarly (as measured by Wasserstein distance) to the reference data, thereby eliminating hidden properties from the learned model. A key challenge is how to find such perturbations, which we address by formulating a min-max optimization problem and developing a two-step method to iteratively solve it. We demonstrate the effectiveness of Mendata by applying it to defeat state-of-the-art data poisoning and data tracing techniques.
CRDec 29, 2021
Distance-Aware Private Set IntersectionAnrin Chakraborti, Giulia Fanti, Michael K. Reiter
Private set intersection (PSI) allows two mutually untrusting parties to compute an intersection of their sets, without revealing information about items that are not in the intersection. This work introduces a PSI variant called distance-aware PSI (DA-PSI) for sets whose elements lie in a metric space. DA-PSI returns pairs of items that are within a specified distance threshold of each other. This paper puts forward DA-PSI constructions for two metric spaces: (i) Minkowski distance of order 1 over the set of integers (i.e., for integers $a$ and $b$, their distance is $|a-b|$); and (ii) Hamming distance over the set of binary strings of length $\ell$. In the Minkowski DA-PSI protocol, the communication complexity scales logarithmically in the distance threshold and linearly in the set size. In the Hamming DA-PSI protocol, the communication volume scales quadratically in the distance threshold and is independent of the dimensionality of string length $\ell$. Experimental results with real applications confirm that DA-PSI provides more effective matching at lower cost than naive solutions.
LGDec 28, 2021
Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural NetworksWeiran Lin, Keane Lucas, Lujo Bauer et al.
We propose new, more efficient targeted white-box attacks against deep neural networks. Our attacks better align with the attacker's goal: (1) tricking a model to assign higher probability to the target class than to any other class, while (2) staying within an $ε$-distance of the attacked input. First, we demonstrate a loss function that explicitly encodes (1) and show that Auto-PGD finds more attacks with it. Second, we propose a new attack method, Constrained Gradient Descent (CGD), using a refinement of our loss function that captures both (1) and (2). CGD seeks to satisfy both attacker objectives -- misclassification and bounded $\ell_{p}$-norm -- in a principled manner, as part of the optimization, instead of via ad hoc post-processing techniques (e.g., projection or clipping). We show that CGD is more successful on CIFAR10 (0.9--4.2%) and ImageNet (8.6--13.6%) than state-of-the-art attacks while consuming less time (11.4--18.8%). Statistical tests confirm that our attack outperforms others against leading defenses on different datasets and values of $ε$.
CRAug 3, 2021
Optimally Hiding Object Sizes with Constrained PaddingAndrew C. Reed, Michael K. Reiter
Among the most challenging traffic-analysis attacks to confound are those leveraging the sizes of objects downloaded over the network. In this paper we systematically analyze this problem under realistic constraints regarding the padding overhead that the object store is willing to incur. We give algorithms to compute privacy-optimal padding schemes -- specifically that minimize the network observer's information gain from a downloaded object's padded size -- in several scenarios of interest: per-object padding, in which the object store responds to each request for an object with the same padded copy; per-request padding, in which the object store pads an object anew each time it serves that object; and a scenario unlike the previous ones in that the object store is unable to leverage a known distribution over the object queries. We provide constructions for privacy-optimal padding in each case, compare them to recent contenders in the research literature, and evaluate their performance on practical datasets.
CRAug 7, 2020
Role-Based Deception in Enterprise NetworksIffat Anjum, Mu Zhu, Isaac Polinsky et al.
Historically, enterprise network reconnaissance is an active process, often involving port scanning. However, as routers and switches become more complex, they also become more susceptible to compromise. From this vantage point, an attacker can passively identify high-value hosts such as the workstations of IT administrators, C-suite executives, and finance personnel. The goal of this paper is to develop a technique to deceive and dissuade such adversaries. We propose HoneyRoles, which uses honey connections to build metaphorical haystacks around the network traffic of client hosts belonging to high-value organizational roles. The honey connections also act as network canaries to signal network compromise, thereby dissuading the adversary from acting on information observed in network flows. We design a prototype implementation of HoneyRoles using an OpenFlow SDN controller and evaluate its security using the PRISM probabilistic model checker. Our performance evaluation shows that HoneyRoles has a small effect on network request completion time and our security analysis demonstrates that once an alert is raised, HoneyRoles can quickly identify the compromised switch with high probability. In doing so, we show that a role-based network deception is a promising approach for defending against adversaries that have compromised network devices.
LGMar 24, 2020
Defense Through Diverse DirectionsChristopher M. Bender, Yang Li, Yifeng Shi et al.
In this work we develop a novel Bayesian neural network methodology to achieve strong adversarial robustness without the need for online adversarial training. Unlike previous efforts in this direction, we do not rely solely on the stochasticity of network weights by minimizing the divergence between the learned parameter distribution and a prior. Instead, we additionally require that the model maintain some expected uncertainty with respect to all input covariates. We demonstrate that by encouraging the network to distribute evenly across inputs, the network becomes less susceptible to localized, brittle features which imparts a natural robustness to targeted perturbations. We show empirical robustness on several benchmark datasets.
CRDec 27, 2019
TASE: Reducing latency of symbolic execution with transactional memoryAdam Humphries, Kartik Cating-Subramanian, Michael K. Reiter
We present the design and implementation of a tool called TASE that uses transactional memory to reduce the latency of symbolic-execution applications with small amounts of symbolic state. Execution paths are executed natively while operating on concrete values, and only when execution encounters symbolic values (or modeled functions) is native execution suspended and interpretation begun. Execution then returns to its native mode when symbolic values are no longer encountered. The key innovations in the design of TASE are a technique for amortizing the cost of checking whether values are symbolic over few instructions, and the use of hardware-supported transactional memory (TSX) to implement native execution that rolls back with no effect when use of a symbolic value is detected (perhaps belatedly). We show that TASE has the potential to dramatically improve some latency-sensitive applications of symbolic execution, such as methods to verify the behavior of a client in a client-server application.
CRDec 23, 2019
Detecting stuffing of a user's credentials at her own accountsKe Coby Wang, Michael K. Reiter
We propose a framework by which websites can coordinate to detect credential stuffing on individual user accounts. Our detection algorithm teases apart normal login behavior (involving password reuse, entering correct passwords into the wrong sites, etc.) from credential stuffing, by leveraging modern anomaly detection and carefully tracking suspicious logins. Websites coordinate using a novel private membership-test protocol, thereby ensuring that information about passwords is not leaked; this protocol is highly scalable, partly due to its use of cuckoo filters, and is more secure than similarly scalable alternatives in an important measure that we define. We use probabilistic model checking to estimate our credential-stuffing detection accuracy across a range of operating points. These methods might be of independent interest for their novel application of formal methods to estimate the usability impacts of our design. We show that even a minimal-infrastructure deployment of our framework should already support the combined login load experienced by the airline, hotel, retail, and consumer banking industries in the U.S.
CRDec 19, 2019
Malware Makeover: Breaking ML-based Static Analysis by Modifying Executable BytesKeane Lucas, Mahmood Sharif, Lujo Bauer et al.
Motivated by the transformative impact of deep neural networks (DNNs) in various domains, researchers and anti-virus vendors have proposed DNNs for malware detection from raw bytes that do not require manual feature engineering. In this work, we propose an attack that interweaves binary-diversification techniques and optimization frameworks to mislead such DNNs while preserving the functionality of binaries. Unlike prior attacks, ours manipulates instructions that are a functional part of the binary, which makes it particularly challenging to defend against. We evaluated our attack against three DNNs in white- and black-box settings, and found that it often achieved success rates near 100%. Moreover, we found that our attack can fool some commercial anti-viruses, in certain cases with a success rate of 85%. We explored several defenses, both new and old, and identified some that can foil over 80% of our evasion attempts. However, these defenses may still be susceptible to evasion by attacks, and so we advocate for augmenting malware-detection systems with methods that do not rely on machine learning.
CVDec 19, 2019
$n$-ML: Mitigating Adversarial Examples via Ensembles of Topologically Manipulated ClassifiersMahmood Sharif, Lujo Bauer, Michael K. Reiter
This paper proposes a new defense called $n$-ML against adversarial examples, i.e., inputs crafted by perturbing benign inputs by small amounts to induce misclassifications by classifiers. Inspired by $n$-version programming, $n$-ML trains an ensemble of $n$ classifiers, and inputs are classified by a vote of the classifiers in the ensemble. Unlike prior such approaches, however, the classifiers in the ensemble are trained specifically to classify adversarial examples differently, rendering it very difficult for an adversarial example to obtain enough votes to be misclassified. We show that $n$-ML roughly retains the benign classification accuracies of state-of-the-art models on the MNIST, CIFAR10, and GTSRB datasets, while simultaneously defending against adversarial examples with better resilience than the best defenses known to date and, in most cases, with lower classification-time overhead.
CROct 18, 2019
n-m-Variant Systems: Adversarial-Resistant Software Rejuvenation for Cloud-Based Web ApplicationsIsaac Polinsky, Kyle Martin, William Enck et al.
Web servers are a popular target for adversaries as they are publicly accessible and often vulnerable to compromise. Compromises can go unnoticed for months, if not years, and recovery often involves a complete system rebuild. In this paper, we propose n-m-Variant Systems, an adversarial-resistant software rejuvenation framework for cloud-based web applications. We improve the state-of-the-art by introducing a variable m that provides a knob for administrators to tune an environment to balance resource usage, performance overhead, and security guarantees. Using m, security guarantees can be tuned for seconds, minutes, days, or complete resistance. We design and implement an n-m-Variant System prototype to protect a Mediawiki PHP application serving dynamic content from an external SQL persistent storage. Our performance evaluation shows a throughput reduction of 65% for 108 seconds of resistance and 83% for 12 days of resistance to sophisticated adversaries, given appropriate resource allocation. Furthermore, we use theoretical analysis and simulation to characterize the impact of system parameters on resilience to adversaries. Through these efforts, our work demonstrates how properties of cloud-based servers can enhance the integrity of Web servers.
CRJul 10, 2018
sAVSS: Scalable Asynchronous Verifiable Secret Sharing in BFT ProtocolsSoumya Basu, Alin Tomescu, Ittai Abraham et al.
This paper introduces a new way to incorporate verifiable secret sharing (VSS) schemes into Byzantine Fault Tolerance (BFT) protocols. This technique extends the threshold guarantee of classical Byzantine Fault Tolerant algorithms to include privacy as well. This provides applications with a powerful primitive: a threshold trusted third party, which simplifies many difficult problems such as a fair exchange. In order to incorporate VSS into BFT, we introduced sAVSS, a framework that transforms any VSS scheme into an asynchronous VSS scheme with constant overhead. By incorporating Kate et al.'s scheme into our framework, we obtain an asynchronous VSS that has constant overhead on each replica -- the first of its kind. We show that a key-value store built using BFT replication and sAVSS supports writing secret-shared values with about a 30% - 50% throughput overhead with less than 35 millisecond request latencies.
CRMay 1, 2018
How to end password reuse on the webKe Coby Wang, Michael K. Reiter
We present a framework by which websites can coordinate to make it difficult for users to set similar passwords at these websites, in an effort to break the culture of password reuse on the web today. Though the design of such a framework is fraught with risks to users' security and privacy, we show that these risks can be effectively mitigated through careful scoping of the goals for such a framework and through principled design. At the core of our framework is a private set-membership-test protocol that enables one website to determine, upon a user setting a password for use at it, whether that user has already set a similar password at another participating website, but with neither side disclosing to the other the password(s) it employs in the protocol. Our framework then layers over this protocol a collection of techniques to mitigate the leakage necessitated by such a test. We verify via probabilistic model checking that these techniques are effective in maintaining account security, and since these mechanisms are consistent with common user experience today, our framework should be unobtrusive to users who do not reuse similar passwords across websites (e.g., due to having adopted a password manager). Through a working implementation of our framework and optimization of its parameters based on insights of how passwords tend to be reused, we show that our design can meet the scalability challenges facing such a service.
CRFeb 27, 2018
On the Suitability of $L_p$-norms for Creating and Preventing Adversarial ExamplesMahmood Sharif, Lujo Bauer, Michael K. Reiter
Much research effort has been devoted to better understanding adversarial examples, which are specially crafted inputs to machine-learning models that are perceptually similar to benign inputs, but are classified differently (i.e., misclassified). Both algorithms that create adversarial examples and strategies for defending against them typically use $L_p$-norms to measure the perceptual similarity between an adversarial input and its benign original. Prior work has already shown, however, that two images need not be close to each other as measured by an $L_p$-norm to be perceptually similar. In this work, we show that nearness according to an $L_p$-norm is not just unnecessary for perceptual similarity, but is also insufficient. Specifically, focusing on datasets (CIFAR10 and MNIST), $L_p$-norms, and thresholds used in prior work, we show through online user studies that "adversarial examples" that are closer to their benign counterparts than required by commonly used $L_p$-norm thresholds can nevertheless be perceptually different to humans from the corresponding benign examples. Namely, the perceptual distance between two images that are "near" each other according to an $L_p$-norm can be high enough that participants frequently classify the two images as representing different objects or digits. Combined with prior work, we thus demonstrate that nearness of inputs as measured by $L_p$-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples. We propose and discuss alternative similarity metrics to stimulate future research in the area.
CVDec 31, 2017
A General Framework for Adversarial Examples with ObjectivesMahmood Sharif, Sruti Bhagavatula, Lujo Bauer et al.
Images perturbed subtly to be misclassified by neural networks, called adversarial examples, have emerged as a technically deep challenge and an important concern for several application domains. Most research on adversarial examples takes as its only constraint that the perturbed images are similar to the originals. However, real-world application of these ideas often requires the examples to satisfy additional objectives, which are typically enforced through custom modifications of the perturbation process. In this paper, we propose adversarial generative nets (AGNs), a general methodology to train a generator neural network to emit adversarial examples satisfying desired objectives. We demonstrate the ability of AGNs to accommodate a wide range of objectives, including imprecise ones difficult to model, in two application domains. In particular, we demonstrate physical adversarial examples---eyeglass frames designed to fool face recognition---with better robustness, inconspicuousness, and scalability than previous approaches, as well as a new attack to fool a handwritten-digit classifier.
CRSep 9, 2016
Stealing Machine Learning Models via Prediction APIsFlorian Tramèr, Fan Zhang, Ari Juels et al.
Machine learning (ML) models may be deemed confidential due to their sensitive training data, commercial value, or use in security applications. Increasingly often, confidential ML models are being deployed with publicly accessible query interfaces. ML-as-a-service ("predictive analytics") systems are an example: Some allow users to train models on potentially sensitive data and charge others for access on a pay-per-query basis. The tension between model confidentiality and public access motivates our investigation of model extraction attacks. In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model. Unlike in classical learning theory settings, ML-as-a-service offerings may accept partial feature vectors as inputs and include confidence values with predictions. Given these practices, we show simple, efficient attacks that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees. We demonstrate these attacks against the online services of BigML and Amazon Machine Learning. We further show that the natural countermeasure of omitting confidence values from model outputs still admits potentially harmful model extraction attacks. Our results highlight the need for careful ML model deployment and new model extraction countermeasures.
CRMar 17, 2016
A software approach to defeating side channels in last-level cachesZiqiao Zhou, Michael K. Reiter, Yinqian Zhang
We present a software approach to mitigate access-driven side-channel attacks that leverage last-level caches (LLCs) shared across cores to leak information between security domains (e.g., tenants in a cloud). Our approach dynamically manages physical memory pages shared between security domains to disable sharing of LLC lines, thus preventing "Flush-Reload" side channels via LLCs. It also manages cacheability of memory pages to thwart cross-tenant "Prime-Probe" attacks in LLCs. We have implemented our approach as a memory management subsystem called CacheBar within the Linux kernel to intervene on such side channels across container boundaries, as containers are a common method for enforcing tenant isolation in Platform-as-a-Service (PaaS) clouds. Through formal verification, principled analysis, and empirical evaluation, we show that CacheBar achieves strong security with small performance overheads for PaaS workloads.
CRMar 13, 2016
Server-side verification of client behavior in cryptographic protocolsAndrew Chi, Robert Cochran, Marie Nesfield et al.
Numerous exploits of client-server protocols and applications involve modifying clients to behave in ways that untampered clients would not, such as crafting malicious packets. In this paper, we demonstrate practical verification of a cryptographic protocol client's messaging behavior as being consistent with the client program it is believed to be running. Moreover, we accomplish this without modifying the client in any way, and without knowing all of the client-side inputs driving its behavior. Our toolchain for verifying a client's messages explores multiple candidate execution paths in the client concurrently, an innovation that we show is both specifically useful for cryptographic protocol clients and more generally useful for client applications of other types, as well. In addition, our toolchain includes a novel approach to symbolically executing the client software in multiple passes that defers expensive functions until their inputs can be inferred and concretized. We demonstrate client verification on OpenSSL to show that, e.g., Heartbleed exploits can be detected without Heartbleed-specific filtering and within seconds of the first malicious packet, and that verification of legitimate clients can keep pace with, e.g., Gmail workloads.