Gururaj Saileshwar

CR
h-index15
8papers
44citations
Novelty69%
AI Score50

8 Papers

LGDec 19, 2024Code
Time Will Tell: Timing Side Channels via Output Token Count in Large Language Models

Tianchen Zhang, Gururaj Saileshwar, David Lie

This paper demonstrates a new side-channel that enables an adversary to extract sensitive information about inference inputs in large language models (LLMs) based on the number of output tokens in the LLM response. We construct attacks using this side-channel in two common LLM tasks: recovering the target language in machine translation tasks and recovering the output class in classification tasks. In addition, due to the auto-regressive generation mechanism in LLMs, an adversary can recover the output token count reliably using a timing channel, even over the network against a popular closed-source commercial LLM. Our experiments show that an adversary can learn the output language in translation tasks with more than 75% precision across three different models (Tower, M2M100, MBart50). Using this side-channel, we also show the input class in text classification tasks can be leaked out with more than 70% precision from open-source LLMs like Llama-3.1, Llama-3.2, Gemma2, and production models like GPT-4o. Finally, we propose tokenizer-, system-, and prompt-based mitigations against the output token count side-channel.

CRDec 10, 2024Code
PrisonBreak: Jailbreaking Large Language Models with at Most Twenty-Five Targeted Bit-flips

Zachary Coalson, Jeonghyun Woo, Chris S. Lin et al.

We study a new vulnerability in commercial-scale safety-aligned large language models (LLMs): their refusal to generate harmful responses can be broken by flipping only a few bits in model parameters. Our attack jailbreaks billion-parameter language models with just 5 to 25 bit-flips, requiring up to 40$\times$ fewer bit flips than prior attacks on much smaller computer vision models. Unlike prompt-based jailbreaks, our method directly uncensors models in memory at runtime, enabling harmful outputs without requiring input-level modifications. Our key innovation is an efficient bit-selection algorithm that identifies critical bits for language model jailbreaks up to 20$\times$ faster than prior methods. We evaluate our attack on 10 open-source LLMs, achieving high attack success rates (ASRs) of 80-98% with minimal impact on model utility. We further demonstrate an end-to-end exploit via Rowhammer-based fault injection, reliably jailbreaking 5 models (69-91% ASR) on a GDDR6 GPU. Our analyses reveal that: (1) models with weaker post-training alignment require fewer bit-flips to jailbreak; (2) certain model components, e.g., value projection layers, are substantially more vulnerable; and (3) the attack is mechanistically different from existing jailbreak methods. We evaluate potential countermeasures and find that our attack remains effective against defenses at various stages of the LLM pipeline.

35.0CRMay 5
GPUBreach: Privilege Escalation Attacks on GPUs using Rowhammer

Chris S. Lin, Yuqin Yan, Guozhen Ding et al.

NVIDIA GPUs with GDDR memories have been shown susceptible to Rowhammer-based bit-flips, similar to CPUs. However, Rowhammer exploits on GPUs have been limited to injecting untargeted bit-flips in victim data like weights of machine learning models, to degrade model accuracy, unlike CPU exploits shown capable of privilege escalation. In this paper, we demonstrate that GPU Rowhammer exploits can be as potent as CPU Rowhammer attacks. By exploiting the GPU page table management to identify when and where new page tables are allocated, we enable an unprivileged user CUDA kernel of one process to use RowHammer bit-flips to gain access to the GPU memory of other processes or co-tenants via targeted tampering of such page-tables resident on the GPU memory. Using this newly found primitive, we demonstrate the first GPU-side privilege escalation attacks, leaking secret data such as cryptographic keys from cuPQC libraries, and even tampering with the model's GPU assembly code to degrade models more stealthily than previous attacks. We further demonstrate that GPU-side privilege escalation can lead to CPU-side privilege escalation, defeating the protections provided by the IOMMU, enabling a malicious user-level program with GPU access to gain root shell and system-wide control, even in a non-multi-tenant setting.

CLNov 1, 2024
When Speculation Spills Secrets: Side Channels via Speculative Decoding In LLMs

Jiankun Wei, Abdulrahman Abdulrazzag, Tianchen Zhang et al.

Deployed large language models (LLMs) often rely on speculative decoding, a technique that generates and verifies multiple candidate tokens in parallel, to improve throughput and latency. In this work, we reveal a new side-channel whereby input-dependent patterns of correct and incorrect speculations can be inferred by monitoring per-iteration token counts or packet sizes.We demonstrate that an adversary observing these patterns can fingerprint user queries with >90% accuracy across four speculative-decoding schemes, REST (100\%), LADE (up to 92%), BiLD (up to 95%), and EAGLE (up to 77.6%) and leak confidential datastore contents used for prediction at rates exceeding 25 tokens/sec. We evaluate the side-channel attacks in both research prototypes as well as the production-grade vLLM serving framework. To defend against these, we propose and evaluate a suite of mitigations, including packet padding and iteration-wise token aggregation.

LGOct 19, 2025
CLIP: Client-Side Invariant Pruning for Mitigating Stragglers in Secure Federated Learning

Anthony DiMaggio, Raghav Sharma, Gururaj Saileshwar

Secure federated learning (FL) preserves data privacy during distributed model training. However, deploying such frameworks across heterogeneous devices results in performance bottlenecks, due to straggler clients with limited computational or network capabilities, slowing training for all participating clients. This paper introduces the first straggler mitigation technique for secure aggregation with deep neural networks. We propose CLIP, a client-side invariant neuron pruning technique coupled with network-aware pruning, that addresses compute and network bottlenecks due to stragglers during training with minimal accuracy loss. Our technique accelerates secure FL training by 13% to 34% across multiple datasets (CIFAR10, Shakespeare, FEMNIST) with an accuracy impact of between 1.3% improvement to 2.6% reduction.

CRNov 16, 2021
Practical Timing Side Channel Attacks on Memory Compression

Martin Schwarzl, Pietro Borrello, Gururaj Saileshwar et al.

Compression algorithms are widely used as they save memory without losing data. However, elimination of redundant symbols and sequences in data leads to a compression side channel. So far, compression attacks have only focused on the compression-ratio side channel, i.e., the size of compressed data,and largely targeted HTTP traffic and website content. In this paper, we present the first memory compression attacks exploiting timing side channels in compression algorithms, targeting a broad set of applications using compression. Our work systematically analyzes different compression algorithms and demonstrates timing leakage in each. We present Comprezzor,an evolutionary fuzzer which finds memory layouts that lead to amplified latency differences for decompression and therefore enable remote attacks. We demonstrate a remote covert channel exploiting small local timing differences transmitting on average 643.25 bit/h over 14 hops over the internet. We also demonstrate memory compression attacks that can leak secrets bytewise as well as in dictionary attacks in three different case studies. First, we show that an attacker can disclose secrets co-located and compressed with attacker data in PHP applications using Memcached. Second, we present an attack that leaks database records from PostgreSQL, managed by a Python-Flask application, over the internet. Third, we demonstrate an attack that leaks secrets from transparently compressed pages with ZRAM,the memory compression module in Linux. We conclude that memory-compression attacks are a practical threat.

CRSep 18, 2020
MIRAGE: Mitigating Conflict-Based Cache Attacks with a Practical Fully-Associative Design

Gururaj Saileshwar, Moinuddin Qureshi

Shared processor caches are vulnerable to conflict-based side-channel attacks, where an attacker can monitor access patterns of a victim by evicting victim cache lines using cache-set conflicts. Recent mitigations propose randomized mapping of addresses to cache lines to obfuscate the locations of set-conflicts. However, these are vulnerable to new attacks that discover conflicting sets of addresses despite such mitigations, because these designs select eviction-candidates from a small set of conflicting lines. This paper presents Mirage, a practical design for a fully associative cache, wherein eviction candidates are selected randomly from all lines resident in the cache, to be immune to set-conflicts. A key challenge for enabling such designs in large shared caches (containing tens of thousands of cache lines) is the complexity of cache-lookup, as a naive design can require searching through all the resident lines. Mirage achieves full-associativity while retaining practical set-associative lookups by decoupling placement and replacement, using pointer-based indirection from tag-store to data-store to allow a newly installed address to globally evict the data of any random resident line. To eliminate set-conflicts, Mirage provisions extra invalid tags in a skewed-associative tag-store design where lines can be installed without set-conflict, along with a load-aware skew-selection policy that guarantees the availability of sets with invalid tags. Our analysis shows Mirage provides the global eviction property of a fully-associative cache throughout system lifetime (violations of full-associativity, i.e. set-conflicts, occur less than once in 10^4 to 10^17 years), thus offering a principled defense against any eviction-set discovery and any potential conflict based attacks. Mirage incurs limited slowdown (2%) and 17-20% extra storage compared to a non-secure cache.

CRJun 6, 2019
Lookout for Zombies: Mitigating Flush+Reload Attack on Shared Caches by Monitoring Invalidated Lines

Gururaj Saileshwar, Moinuddin K. Qureshi

OS-based page sharing is a commonly used optimization in modern systems to reduce memory footprint. Unfortunately, such sharing can cause Flush+Reload cache attacks, whereby a spy periodically flushes a cache line of shared data (using the clflush instruction) and reloads it to infer the access patterns of a victim application. Current proposals to mitigate Flush+Reload attacks are impractical as they either disable page sharing, or require application rewrite, or require OS support, or incur ISA changes. Ideally, we want to tolerate attacks without requiring any OS or ISA support and while incurring negligible performance and storage overheads. This paper makes the key observation that when a cache line is invalidated due to a Flush-Caused Invalidation (FCI), the tag and data of the invalidated line are still resident in the cache and can be used for detecting Flush-based attacks. We call lines invalidated due to FCI as Zombie lines. Our design explicitly marks such lines as Zombies, preserves the Zombie lines in the cache, and uses the hits and misses to Zombie lines to tolerate the attacks. We propose Zombie-Based Mitigation (ZBM), a simple hardware-based design that successfully guards against attacks by simply treating hits on Zombie-lines as misses to avoid any timing leaks to the attacker. We analyze the robustness of ZBM using three spy programs: attacking AES T-Tables, attacking RSA Square-and-Multiply, and Function Watcher (FW), and show that ZBM successfully defends against these attacks. Our solution requires negligible storage (4-bits per cache line), retains OS-based page sharing, requires no OS/ISA changes, and does not incur slowdown for benign applications.