Pritam Dash

CR
h-index10
4papers
31citations
Novelty60%
AI Score49

4 Papers

65.6CRJun 3
From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents

Pritam Dash, Tongyu Ge, Aditi Jain et al.

Memory is a core component of AI agents, enabling them to accumulate knowledge across interactions and improve performance. However, persistent memory introduces the risk of memory poisoning, where a single adversarial memory write can exert long-term influence over agent behavior. We present a systematic study of memory poisoning in LLM-based agents. We identify four memory write channels and nine structural vulnerabilities in model capabilities, system prompt design, and agent system architecture that make these channels exploitable. Based on these vulnerabilities, we develop a taxonomy of six classes of memory poisoning attacks. Furthermore, we design MPBench -- a benchmark for evaluating memory poisoning attacks, and show that agents designed to write and retrieve memory more aggressively are more exploitable. We also show that existing prompt injection defenses fail to cover memory poisoning attacks. Our findings provide a foundation for understanding and mitigating memory poisoning attacks against AI agents.

14.2CRMay 30
Framework for Discovering GPS Spoofing Attacks in Drone Swarms

Yingao Elaine Yao, Pritam Dash, Karthik Pattabiraman

Swarm robotics, particularly drone swarms, are used in various safety-critical tasks. While a lot of attention has been given to improving swarm control algorithms for improved intelligence, the security implications of various design choices in swarm control algorithms have not been studied. We highlight how an attacker can exploit the vulnerabilities in swarm control algorithms to disrupt drone swarms. Specifically, we show that the attacker can target a swarm member (target drone) through GPS spoofing attacks, and indirectly cause other swarm members (victim drones) to veer from their course, resulting in collisions. We call these Swarm Propagation Vulnerabilities (SPVs). In this paper, we introduce two fuzzing tools, SwarmFuzzGraph and SwarmFuzzBinary, to efficiently find SPVs in swarm control algorithms. SwarmFuzzGraph uses a combination of graph theory and gradient-guided optimization to find SPVs. Our evaluation on a popular swarm control algorithm shows that SwarmFuzzGraph achieves an average success rate of 48.8% in finding SPVs. However, SwarmFuzzGraph fails to find any SPVs in drone swarms with different topologies. We then propose SwarmFuzzBinary, which uses observation-based seed scheduling and binary search to find SPVs. The evaluation shows that SwarmFuzzBinary's success rate is comparable to SwarmFuzzGraph and work in all tested algorithms.

LGJun 27, 2025
ARMOR: Robust Reinforcement Learning-based Control for UAVs under Physical Attacks

Pritam Dash, Ethan Chan, Nathan P. Lawrence et al.

Unmanned Aerial Vehicles (UAVs) depend on onboard sensors for perception, navigation, and control. However, these sensors are susceptible to physical attacks, such as GPS spoofing, that can corrupt state estimates and lead to unsafe behavior. While reinforcement learning (RL) offers adaptive control capabilities, existing safe RL methods are ineffective against such attacks. We present ARMOR (Adaptive Robust Manipulation-Optimized State Representations), an attack-resilient, model-free RL controller that enables robust UAV operation under adversarial sensor manipulation. Instead of relying on raw sensor observations, ARMOR learns a robust latent representation of the UAV's physical state via a two-stage training framework. In the first stage, a teacher encoder, trained with privileged attack information, generates attack-aware latent states for RL policy training. In the second stage, a student encoder is trained via supervised learning to approximate the teacher's latent states using only historical sensor data, enabling real-world deployment without privileged information. Our experiments show that ARMOR outperforms conventional methods, ensuring UAV safety. Additionally, ARMOR improves generalization to unseen attacks and reduces training cost by eliminating the need for iterative adversarial training.

CRAug 11, 2021
Jujutsu: A Two-stage Defense against Adversarial Patch Attacks on Deep Neural Networks

Zitao Chen, Pritam Dash, Karthik Pattabiraman

Adversarial patch attacks create adversarial examples by injecting arbitrary distortions within a bounded region of the input to fool deep neural networks (DNNs). These attacks are robust (i.e., physically-realizable) and universally malicious, and hence represent a severe security threat to real-world DNN-based systems. We propose Jujutsu, a two-stage technique to detect and mitigate robust and universal adversarial patch attacks. We first observe that adversarial patches are crafted as localized features that yield large influence on the prediction output, and continue to dominate the prediction on any input. Jujutsu leverages this observation for accurate attack detection with low false positives. Patch attacks corrupt only a localized region of the input, while the majority of the input remains unperturbed. Therefore, Jujutsu leverages generative adversarial networks (GAN) to perform localized attack recovery by synthesizing the semantic contents of the input that are corrupted by the attacks, and reconstructs a ``clean'' input for correct prediction. We evaluate Jujutsu on four diverse datasets spanning 8 different DNN models, and find that it achieves superior performance and significantly outperforms four existing defenses. We further evaluate Jujutsu against physical-world attacks, as well as adaptive attacks.