Darya Kaviani

CR
3 papers
13 citations
Novelty 52%
AI Score 42

3 Papers

96.6 CR May 3
Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration

Debeshee Das, Julien Piet, Darya Kaviani et al.

Memory systems enable otherwise-stateless LLM agents to persist user information across sessions, but also introduce a new attack surface. We characterize the Trojan Hippo attack, a class of persistent memory attacks that operates in a more realistic threat model than prior memory poisoning work: the attacker plants a dormant payload into an agent's long-term memory via a single untrusted tool call (e.g., a crafted email), which activates only when the user later discusses sensitive topics such as finance, health, or identity, and exfiltrates high-value personal data to the attacker. While anecdotal demonstrations of such attacks have appeared against deployed systems, no prior work systematically evaluates them across heterogeneous memory architectures and defenses. We introduce a dynamic evaluation framework comprising two components: (1) an OpenEvolve-based adaptive red-teaming benchmark that stress-tests defenses and memory backends against continuously refined attacks, and (2) the first capability-aware security/utility analysis for persistent memory systems, enabling principled reasoning about defense deployment across different usage profiles. Instantiated on an email assistant across four memory backends (explicit tool memory, agentic memory, RAG, and sliding-window context), Trojan Hippo achieves up to 85-100 percent attack success rate (ASR) against current frontier models from OpenAI and Google, with planted memories successfully activating even after 100 benign sessions. We evaluate four memory-system defenses inspired by basic security principles, finding they substantially reduce attack success rates (to as low as 0-5 percent), though at utility costs that vary widely with task requirements. Because of this substantial security-utility tradeoff, the effective real-world deployment of defenses remains an open challenge, which our evaluation framework is specifically designed to address.

97.4 CR Apr 2
Opal: Private Memory for Personal AI

Darya Kaviani, Alp Eren Ozdarendeli, Jinhao Zhu et al.

Personal AI systems increasingly retain long-term memory of user activity, including documents, emails, messages, meetings, and ambient recordings. Trusted hardware can keep this data private, but struggles to scale with a growing datastore. This pushes the data to external storage, which exposes retrieval access patterns that leak private information to the application provider. Oblivious RAM (ORAM) is a cryptographic primitive that can hide these patterns, but it requires a fixed access budget, precluding the query-dependent traversals that agentic memory systems rely on for accuracy. We present Opal, a private memory system for personal AI. Our key insight is to decouple all data-dependent reasoning from the bulk of personal data, confining it to the trusted enclave. Untrusted disk then sees only fixed, oblivious memory accesses. This enclave-resident component uses a lightweight knowledge graph to capture personal context that semantic search alone misses and handles continuous ingestion by piggybacking reindexing and capacity management on every ORAM access. Evaluated on a comprehensive synthetic personal-data pipeline driven by stochastic communication models, Opal improves retrieval accuracy by 13 percentage points over semantic search and achieves 29x higher throughput with 15x lower infrastructure cost than a secure baseline. Opal is under consideration for deployment to millions of users at a major AI provider.

HC Nov 1, 2021
Bridging Action Frames: Instagram Infographics in U.S. Ethnic Movements

Darya Kaviani, Niloufar Salehi

Instagram infographics are a digital activism tool that have redefined action frames for technology-facilitated social movements. From the 1960s through the 1980s, United States ethnic movements practiced collective action: ideologically unified, resource-intensive traditional activism. Today, technologically enabled movements have been categorized as practicing connective action: individualized, low-resource online activism. Yet, we argue that Instagram infographics are both connective and collective. This paper juxtaposes the insights of past and present U.S. ethnic movement activists and analyzes Black Lives Matter Instagram data over the course of 7 years (2014-2020). We find that Instagram infographic activism bridges connective and collective action in three ways: (1) Scope for Education: Visually enticing and digestible infographics reduce the friction of information dissemination, facilitating collective movement education while preserving customizability. (2) Reconciliation for Credibility: Activists use connective features to combat infographic misinformation and resolve internal differences, creating a trusted collective movement front. (3) High-Resource Efforts for Transformative Change: Instagram infographic activism has been paired with boots on the ground and action-oriented content, curating a connective-to-collective pipeline that expends movement resources. Our work unveils the vitality of evaluating digital activism action frames at the movement integration level, exemplifies the powerful coexistence of connective and collective action, and offers meaningful design implications for activists seeking to leverage this novel tool.