cs.CR · Computer Science

Cryptography & Security

Encryption, privacy, network security

36.7 · CR · Mar 16 · Code

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Mateusz Dziemian, Maxwell Lin, Xiaohan Fu et al. · eth-zurich

This addresses a critical security threat for users of AI agents in high-stakes settings, revealing fundamental weaknesses in current models.

29.3 · CR · Mar 12

Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats

Xinhao Deng, Yixiang Zhang, Jiaqing Wu et al.

This addresses security risks for users and developers of autonomous LLM agents, but it is incremental as it builds on existing threat analysis frameworks.

22.6 · AI · Mar 11 · Code

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu et al.

This addresses security vulnerabilities in frontier LLMs for developers and users, though it is incremental as it builds on existing reinforcement learning and dataset methods.

21.2 · CR · Mar 11

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

Yu He, Haozhe Zhu, Yiming Li et al.

This addresses a critical security vulnerability in LLM agents for users deploying them in untrusted environments, offering a novel defense paradigm that is resilient to adaptive attacks.

22.6 · CR · Mar 18

Resource Consumption Threats in Large Language Models

Yuanhe Zhang, Xinyue Wang, Zhican Chen et al.

This is an incremental survey that addresses resource efficiency issues for LLM providers and users.

19.7 · CR · Apr 20 · Code 9

Benchmarking Misuse Mitigation Against Covert Adversaries

Davis Brown, Mahdi Sabbaghi, Luze Sun et al.

For AI safety researchers, this work highlights a practical attack vector and provides benchmarks to evaluate defenses against covert misuse.

17.5 · LG · Mar 13 · Code 22

PISmith: Reinforcement Learning-based Red Teaming for Prompt Injection Defenses

Chenlong Yin, Runpeng Geng, Yanting Wang et al.

This addresses security risks for real-world LLM applications, particularly autonomous agents, by providing a systematic evaluation method, though it is incremental as it builds on existing RL and red-teaming approaches.

19.6 · CR · Mar 16 · Code

ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems

Yihao Zhang, Zeming Wei, Xiaokun Luan et al.

This addresses critical security risks for users of interconnected multi-agent systems, exposing vulnerabilities that could lead to autonomous attacks without attacker intervention.

19.1 · CR · May 16

Comprehensive Vulnerability Analysis is Necessary for Trustworthy LLM-MAS

Pengfei He, Yue Xing, Juanhui Li et al.

For researchers and practitioners building LLM-MAS, this work provides foundational groundwork for security analysis, but it is primarily a position paper without empirical results.

18.9 · CR · Mar 11

The Attack and Defense Landscape of Agentic AI: A Comprehensive Survey

Juhee Kim, Xiaoyuan Liu, Zhun Wang et al.

This addresses security problems for developers and researchers of AI agent systems, but as a survey it is incremental, synthesizing existing knowledge rather than proposing new methods.

20.5 · CR · Mar 10 · Code

Reasoning-Oriented Programming: Chaining Semantic Gadgets to Jailbreak Large Vision Language Models

Quanchen Zou, Moyang Chen, Zonghao Ying et al.

This work addresses a systemic flaw in LVLM safety for users relying on secure AI systems, representing a novel attack paradigm rather than an incremental improvement.

19.8 · CR · Mar 10

CLIOPATRA: Extracting Private Information from LLM Insights

Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro, Peter Kairouz

This work exposes critical privacy risks in widely used AI platforms and highlights insufficient protections for user data, though it is incremental as it tests existing claims rather than proposing new defenses.

18.8 · CR · Mar 19

A Framework for Formalizing LLM Agent Security

Vincent Siu, Jingxuan He, Kyle Montgomery et al.

This work provides a foundational framework for improving security in LLM agents, though it is incremental as it systematizes existing concepts rather than introducing new methods.

20.0 · CR · Mar 14 · Code

Sirens' Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs

Zijian Ling, Pingyi Hu, Xiuyong Gao et al.

This addresses a critical security problem for users of speech-driven LLMs by demonstrating practical, black-box attacks that are perceptually undetectable, though it is incremental in applying known acoustic techniques to a new domain.

19.1 · CR · Mar 16 · Code 222

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Kai Wang, Biaojie Zeng, Zeming Wei et al.

This addresses safety risks for developers and users of multi-agent systems, though it appears incremental as it builds on existing standards like OWASP.

18.6 · CR · Mar 16

Personalizing Agent Privacy Decisions via Logical Entailment

James Flemings, Ren Yi, Octavian Suciu et al.

This addresses privacy concerns for users of personal LLM agents by improving decision accuracy, though it is incremental as it builds on existing logic and LLM integration approaches.

19.8 · CR · May 18

Agent Security is a Systems Problem

Mihai Christodorescu, Earlence Fernandes, Ashish Hooda et al.

For AI safety researchers and developers, this work reframes agent security as a systems-centric problem rather than a model-centric one, highlighting the insufficiency of model robustness alone.

11.2 · CL · Apr 29

A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?

Ada Chen, Yongjiang Wu, Junyuan Zhang et al. · pku, tencent-ai

For researchers and practitioners developing or deploying LLM-based autonomous agents, this survey systematizes emerging safety and security risks, offering a comprehensive reference.

23.5 · CR · May 12 · Code 67

TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection

Tom Sander, Hongyan Chang, Tomáš Souček et al.

For LLM developers and deployers, TextSeal provides a practical, distortion-free watermarking method that is robust to dilution and supports serving optimizations, addressing the need for provenance and distillation protection.

10.0 · CL · Apr 29

SafeReview: Defending LLM-based Review Systems Against Adversarial Hidden Prompts

Yuan Xin, Yixuan Weng, Minjun Zhu et al.

For academic peer review systems using LLMs, this work provides a dynamic defense against adversarial manipulation, though it is an incremental step in adversarial robustness.