CR, Jan 3, 2025
Rerouting LLM Routers
Avital Shafran, Roei Schuster, Thomas Ristenpart et al.
LLM routers aim to balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity. Routers represent one type of what we call LLM control planes: systems that orchestrate use of one or more LLMs. In this paper, we investigate routers' adversarial robustness. We first define LLM control plane integrity, i.e., robustness of LLM orchestration to adversarial inputs, as a distinct problem in AI safety. Next, we demonstrate that an adversary can generate query-independent token sequences we call "confounder gadgets" that, when added to any query, cause LLM routers to send the query to a strong LLM. Our quantitative evaluation shows that this attack is successful both in white-box and black-box settings against a variety of open-source and commercial routers, and that confounding queries do not affect the quality of LLM responses. Finally, we demonstrate that gadgets can be effective while maintaining low perplexity, so perplexity-based filtering is not an effective defense. We finish by investigating alternative defenses.
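To make the perplexity-based filtering defense concrete, here is a minimal sketch. It assumes the defender already has per-token natural-log probabilities for a query from some language model. The function names, the threshold, and the example numbers are hypothetical and do not come from the paper.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def passes_filter(token_logprobs, threshold):
    """Hypothetical defense: accept a query only if its perplexity is at most threshold."""
    return perplexity(token_logprobs) <= threshold

# Illustrative inputs only (assumed values, not measurements).
natural_query = [math.log(0.25)] * 8     # every token has probability 0.25
garbled_suffix = [math.log(0.001)] * 8   # every token has probability 0.001

print(passes_filter(natural_query, threshold=50.0))
print(passes_filter(garbled_suffix, threshold=50.0))
```

A filter of this shape only rejects inputs that look statistically unnatural, so a gadget optimized to keep perplexity low would pass the same check. That is the limitation the abstract points to.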
CR, Nov 10, 2020
Guarding Serverless Applications with SecLambda
Deepak Sirone Jegan, Liang Wang, Siddhant Bhagat et al.
As an emerging application paradigm, serverless computing attracts attention from more and more attackers. Unfortunately, security tools for conventional applications cannot be easily ported to serverless, and existing serverless security solutions are inadequate. In this paper, we present SecLambda, an extensible security framework that leverages local function state and global application state to perform sophisticated security tasks to protect an application. We show how SecLambda can be used to achieve control flow integrity, credential protection, and rate limiting in serverless applications. We evaluate the performance overhead and security of SecLambda using realistic open-source applications, and our results suggest that SecLambda can mitigate several attacks while introducing relatively low performance overhead.
CR, Sep 29, 2021
Might I Get Pwned: A Second Generation Compromised Credential Checking Service
Bijeeta Pal, Mazharul Islam, Marina Sanusi et al.
Credential stuffing attacks use stolen passwords to log into victim accounts. To defend against these attacks, recently deployed compromised credential checking (C3) services provide APIs that help users and companies check whether a (username, password) pair is exposed. These services, however, only check whether the exact password is leaked, and therefore do not mitigate credential tweaking attacks: attempts to compromise a user account with variants of a user's leaked passwords. Recent work has shown that credential tweaking attacks can compromise accounts quite effectively even when credential stuffing countermeasures are in place. We initiate work on C3 services that protect users from credential tweaking attacks. The core underlying challenge is how to identify passwords that are similar to a user's leaked passwords while preserving honest clients' privacy and also preventing malicious clients from extracting breach data from the service. We formalize the problem and explore ways to measure password similarity that balance efficacy, performance, and security. Based on this study, we design "Might I Get Pwned" (MIGP), a new kind of breach alerting service. Our simulations show that MIGP reduces the efficacy of state-of-the-art 1000-guess credential tweaking attacks by 94%. MIGP preserves user privacy and limits potential exposure of sensitive breach entries. We show that the protocol is fast, with response time close to existing C3 services. We worked with Cloudflare to deploy MIGP in practice.
CR, Sep 3, 2021
Increasing Adversarial Uncertainty to Scale Private Similarity Testing
Yiqing Hua, Armin Namavari, Kaishuo Cheng et al.
Social media and other platforms rely on automated detection of abusive content to help combat disinformation, harassment, and abuse. One common approach is to check user content for similarity against a server-side database of problematic items. However, this method fundamentally endangers user privacy. Instead, we target client-side detection, notifying only the users when such matches occur to warn them against abusive content. Our solution is based on privacy-preserving similarity testing. Existing approaches rely on expensive cryptographic protocols that do not scale well to large databases and may sacrifice the correctness of the matching. To contend with this challenge, we propose and formalize the concept of similarity-based bucketization (SBB). With SBB, a client reveals a small amount of information to a database-holding server so that it can generate a bucket of potentially similar items. The bucket is small enough for efficient application of privacy-preserving protocols for similarity. To analyze the privacy risk of the revealed information, we introduce a framework for measuring an adversary's confidence in inferring a predicate about the client input correctly. We develop a practical SBB protocol for image content, and evaluate its client privacy guarantee with real-world social media data. We then combine SBB with various similarity protocols, showing that the combination with SBB provides a speedup of at least 29x on large-scale databases compared to running the same protocols without SBB, while retaining correctness of over 95%.
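The bucketization idea can be pictured with a toy sketch. The abstract does not say what information the client reveals or how the server forms the bucket, so everything here is an assumption made for illustration: items are represented by short bit vectors (stand-ins for a similarity hash), the client discloses the bits at a few positions, and the server keeps items that agree on those positions up to a mismatch budget. All names and data are hypothetical.

```python
def reveal(hash_bits, positions):
    """Client side (assumed scheme): disclose only the bits at the chosen positions."""
    return [hash_bits[i] for i in positions]

def bucket(database, positions, revealed, max_mismatch):
    """Server side (assumed scheme): keep items that differ from the revealed
    bits in at most max_mismatch of the disclosed positions."""
    selected = []
    for item_id, bits in database.items():
        mismatches = sum(1 for p, r in zip(positions, revealed) if bits[p] != r)
        if mismatches <= max_mismatch:
            selected.append(item_id)
    return selected

# Hypothetical 8-bit hashes for three database items and one client input.
database = {
    "item_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "item_b": [1, 0, 1, 0, 0, 0, 1, 1],
    "item_c": [0, 1, 0, 0, 1, 1, 0, 1],
}
client_hash = [1, 0, 1, 1, 0, 0, 1, 1]
positions = [0, 2, 5]

revealed = reveal(client_hash, positions)
candidates = bucket(database, positions, revealed, max_mismatch=0)
print(candidates)
```

The expensive privacy-preserving similarity protocol would then run only against the items in `candidates` rather than the whole database, which is where a speedup of the kind reported in the abstract would come from.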
CR, May 28, 2020
The Tools and Tactics Used in Intimate Partner Surveillance: An Analysis of Online Infidelity Forums
Emily Tseng, Rosanna Bellini, Nora McDonald et al.
Abusers increasingly use spyware apps, account compromise, and social engineering to surveil their intimate partners, causing substantial harms that can culminate in violence. This form of privacy violation, termed intimate partner surveillance (IPS), is a profoundly challenging problem to address due to the physical access and trust present in the relationship between the target and attacker. While previous research has examined IPS from the perspectives of survivors, we present the first measurement study of online forums in which (potential) attackers discuss IPS strategies and techniques. In domains such as cybercrime, child abuse, and human trafficking, studying the online behaviors of perpetrators has led to better threat intelligence and techniques to combat attacks. We aim to provide similar insights in the context of IPS. We identified five online forums containing discussion of monitoring cellphones and other means of surveilling an intimate partner, including three within the context of investigating relationship infidelity. We perform a mixed-methods analysis of these forums, surfacing the tools and tactics that attackers use to perform surveillance. Via qualitative analysis of forum content, we present a taxonomy of IPS strategies used and recommended by attackers, and synthesize lessons for technologists seeking to curb the spread of IPS.
HC, May 9, 2020
Characterizing Twitter Users Who Engage in Adversarial Interactions against Political Candidates
Yiqing Hua, Mor Naaman, Thomas Ristenpart
Social media provides a critical communication platform for political figures, but also makes them easy targets for harassment. In this paper, we characterize users who adversarially interact with political figures on Twitter using mixed-method techniques. The analysis is based on a dataset of 400 thousand users' 1.2 million replies to 756 candidates for the U.S. House of Representatives in the two months leading up to the 2018 midterm elections. We show that among moderately active users, adversarial activity is associated with decreased centrality in the social graph and increased attention to candidates from the opposing party. When compared to users who are similarly active, highly adversarial users tend to engage in fewer supportive interactions with their own party's candidates and express negativity in their user profiles. Our results can inform the design of platform moderation mechanisms to support political figures countering online harassment.
HC, May 9, 2020
Towards Measuring Adversarial Twitter Interactions against Candidates in the US Midterm Elections
Yiqing Hua, Thomas Ristenpart, Mor Naaman
Adversarial interactions against politicians on social media such as Twitter have a significant impact on society. In particular, they disrupt substantive political discussions online, and may discourage people from seeking public office. In this study, we measure the adversarial interactions against candidates for the US House of Representatives during the run-up to the 2018 US general election. We gather a new dataset consisting of 1.7 million tweets involving candidates, one of the largest corpora focusing on political discourse. We then develop a new technique for detecting tweets with toxic content that are directed at any specific candidate. This technique allows us to more accurately quantify adversarial interactions towards political candidates. Further, we introduce an algorithm to induce candidate-specific adversarial terms to capture more nuanced adversarial interactions that previous techniques may not consider toxic. Finally, we use these techniques to outline the breadth of adversarial interactions seen in the election, including offensive name-calling, threats of violence, posting discrediting information, attacks on identity, and adversarial message repetition.
CR, May 31, 2019
Protocols for Checking Compromised Credentials
Lucy Li, Bijeeta Pal, Junade Ali et al.
To prevent credential stuffing attacks, industry best practice now proactively checks if user credentials are present in known data breaches. Recently, some web services, such as HaveIBeenPwned (HIBP) and Google Password Checkup (GPC), have started providing APIs to check for breached passwords. We refer to such services as compromised credential checking (C3) services. We give the first formal description of C3 services, detailing different settings and operational requirements, and we give relevant threat models. One key security requirement is the secrecy of a user's passwords that are being checked. Current widely deployed C3 services have the user share a small prefix of a hash computed over the user's password. We provide a framework for empirically analyzing the leakage of such protocols, showing that in some contexts knowing the hash prefixes leads to a 12x increase in the efficacy of remote guessing attacks. We propose two new protocols that provide stronger protection for users' passwords, implement them, and show experimentally that they remain practical to deploy.
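The hash-prefix approach used by deployed C3 services can be sketched roughly as follows. This is a simplified, single-process illustration and not the protocol of any particular service. The choice of SHA-1, the five-character hex prefix, and all function names are assumptions made for the example, and the "server" is just a local function call.

```python
import hashlib

PREFIX_HEX_CHARS = 5  # assumed prefix length, chosen for illustration

def hash_password(password):
    """Hex digest of the password (SHA-1 is an assumption of this sketch)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def client_prefix(password):
    """Client side: the only value sent to the server is a short hash prefix."""
    return hash_password(password)[:PREFIX_HEX_CHARS]

def server_bucket(breached_hashes, prefix):
    """Server side: return every breached hash that shares the prefix."""
    return {h for h in breached_hashes if h.startswith(prefix)}

def is_breached(password, breached_hashes):
    """Client side: compare the full hash locally against the returned bucket."""
    returned = server_bucket(breached_hashes, client_prefix(password))  # stands in for a network call
    return hash_password(password) in returned

# Hypothetical breach corpus.
breached = {hash_password(p) for p in ["123456", "password", "qwerty"]}

print(is_breached("password", breached))
print(is_breached("correct horse battery staple", breached))
```

The server never sees the full hash, but the prefix still narrows down which passwords the client could be checking. That narrowing is the leakage the abstract's framework sets out to measure.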
CR, Feb 9, 2018
When Textbook RSA is Used to Protect the Privacy of Hundreds of Millions of Users
Jeffrey Knockel, Thomas Ristenpart, Jedidiah Crandall
We evaluate Tencent's QQ Browser, a popular mobile browser in China with hundreds of millions of users (including 16 million overseas), with respect to the threat model of a man-in-the-middle attacker with state actor capabilities. This is motivated by information in the Snowden revelations suggesting that another Chinese mobile browser, UC Browser, was being used to track users by Western nation-state adversaries. Among the many issues we found in QQ Browser that are presented in this paper, the use of "textbook RSA" (that is, RSA implemented as shown in textbooks, with no padding) is particularly interesting because it affords us the opportunity to contextualize existing research in breaking textbook RSA. We also present a novel attack on QQ Browser's use of textbook RSA that is distinguished from previous research by its simplicity. We emphasize that although QQ Browser's cryptography and our attacks on it are very simple, the impact is serious. Thus, research into how to break very poor cryptography (such as textbook RSA) has both pedagogical value and real-world impact.
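For readers unfamiliar with why unpadded RSA is considered weak, the toy sketch below shows two generic, well-known properties: encryption is deterministic, and ciphertexts are multiplicatively malleable. It does not reproduce the paper's attack on QQ Browser. The tiny primes and the message value are assumptions chosen so the arithmetic is easy to follow, and the modular-inverse form of `pow` requires Python 3.8 or later.

```python
# Toy key with tiny primes (illustrative only; real keys are far larger).
p, q = 61, 53
n = p * q
e = 17
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)  # modular inverse of e, Python 3.8+

def encrypt(m):
    """Textbook RSA: raw modular exponentiation with no padding."""
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

m = 42
c = encrypt(m)

# Deterministic: the same plaintext always yields the same ciphertext,
# so an observer can confirm a guessed plaintext by re-encrypting it.
same_ciphertext = encrypt(m) == c

# Malleable: multiplying by Enc(2) produces a valid encryption of 2*m mod n,
# without any knowledge of the private key.
c_doubled = (c * encrypt(2)) % n
recovered = decrypt(c_doubled)

print(same_ciphertext, recovered == (2 * m) % n)
```

Randomized padding schemes such as OAEP are designed to remove both properties.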
CR, Sep 22, 2017
Machine Learning Models that Remember Too Much
Congzheng Song, Thomas Ristenpart, Vitaly Shmatikov
Machine learning (ML) is becoming a commodity. Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data. It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data. We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model. In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model yet the model is as accurate and predictive as a conventionally trained model. We then explain how the adversary can extract memorized information from the model. We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB). In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data.
CR, Sep 9, 2016
Stealing Machine Learning Models via Prediction APIs
Florian Tramèr, Fan Zhang, Ari Juels et al.
Machine learning (ML) models may be deemed confidential due to their sensitive training data, commercial value, or use in security applications. Increasingly often, confidential ML models are being deployed with publicly accessible query interfaces. ML-as-a-service ("predictive analytics") systems are an example: Some allow users to train models on potentially sensitive data and charge others for access on a pay-per-query basis. The tension between model confidentiality and public access motivates our investigation of model extraction attacks. In such attacks, an adversary with black-box access, but no prior knowledge of an ML model's parameters or training data, aims to duplicate the functionality of (i.e., "steal") the model. Unlike in classical learning theory settings, ML-as-a-service offerings may accept partial feature vectors as inputs and include confidence values with predictions. Given these practices, we show simple, efficient attacks that extract target ML models with near-perfect fidelity for popular model classes including logistic regression, neural networks, and decision trees. We demonstrate these attacks against the online services of BigML and Amazon Machine Learning. We further show that the natural countermeasure of omitting confidence values from model outputs still admits potentially harmful model extraction attacks. Our results highlight the need for careful ML model deployment and new model extraction countermeasures.
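The role of confidence values in model extraction can be illustrated with a small equation-solving sketch for logistic regression. It is a toy written for this listing, not the paper's code. The secret parameters, the choice of query points (the origin and the unit vectors), and all function names are assumptions. The idea is that the logit of a returned confidence is linear in the input, so a model with d features can be recovered from d + 1 queries.

```python
import math

def make_target(weights, bias):
    """Build a black-box 'prediction API' that returns a confidence value."""
    def predict(x):
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-z))
    return predict

def logit(p):
    """Inverse of the sigmoid: maps a confidence back to the linear score."""
    return math.log(p / (1.0 - p))

def extract(predict, n_features):
    """Recover weights and bias using n_features + 1 queries."""
    bias = logit(predict([0.0] * n_features))        # query at the origin gives the bias
    weights = []
    for i in range(n_features):
        x = [0.0] * n_features
        x[i] = 1.0                                   # unit vector isolates weight i
        weights.append(logit(predict(x)) - bias)
    return weights, bias

# Hypothetical secret model held by the service.
secret_w, secret_b = [0.8, -1.5, 2.0], 0.3
api = make_target(secret_w, secret_b)

stolen_w, stolen_b = extract(api, n_features=3)
print(stolen_w, stolen_b)
```

If the API returned only class labels, this direct inversion would no longer work. The abstract notes that omitting confidence values still leaves room for other, potentially harmful, extraction attacks.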
NI, May 13, 2016
Network Traffic Obfuscation and Automated Internet Censorship
Lucas Dixon, Thomas Ristenpart, Thomas Shrimpton
Internet censors seek ways to identify and block Internet access to information they deem objectionable. Increasingly, censors deploy advanced networking tools such as deep-packet inspection (DPI) to identify such connections. In response, activists and academic researchers have developed and deployed network traffic obfuscation mechanisms. These apply specialized cryptographic tools to attempt to hide from DPI the true nature and content of connections. In this survey, we give an overview of network traffic obfuscation and its role in circumventing Internet censorship. We provide historical and technical background that motivates the need for obfuscation tools, and give an overview of approaches to obfuscation used by state-of-the-art tools. We discuss the latest research on how censors might detect these efforts. We also describe the current challenges to censorship circumvention research and identify concrete ways for the community to address these challenges.
CR, Jul 11, 2015
A Placement Vulnerability Study in Multi-tenant Public Clouds
Venkatanathan Varadarajan, Yinqian Zhang, Thomas Ristenpart et al.
Public infrastructure-as-a-service clouds, such as Amazon EC2, Google Compute Engine (GCE), and Microsoft Azure, allow clients to run virtual machines (VMs) on shared physical infrastructure. This practice of multi-tenancy brings economies of scale, but also introduces the risk of sharing a physical server with an arbitrary and potentially malicious VM. Past works have demonstrated how to place a VM alongside a target victim (co-location) in early-generation clouds and how to extract secret information via side channels. Although there have been numerous works on side-channel attacks, there have been no studies on placement vulnerabilities in public clouds since the adoption of stronger isolation technologies such as Virtual Private Clouds (VPCs). We investigate this problem of placement vulnerabilities and quantitatively evaluate three popular public clouds for their susceptibility to co-location attacks. We find that adoption of new technologies (e.g., VPC) makes many prior attacks, such as cloud cartography, ineffective. We find new ways to reliably test for co-location across Amazon EC2, Google GCE, and Microsoft Azure. We also find ways to detect co-location with victim web servers in a multi-tiered cloud application located behind a load balancer. We use our new co-residence tests and multiple customer accounts to launch VM instances under different strategies that seek to maximize the likelihood of co-residency. We find that it is much easier (10x higher success rate) and cheaper (up to $114 less) to achieve co-location in these three clouds when compared to a secure reference placement policy.