Security Considerations for Artificial Intelligence Agents

Ninghui Li, Kaiyuan Zhang, Kyle Polley, Jerry Ma

arXiv:2603.12230v111.85 citations · h-index: 11

Predicted impact: top 13% in LG · last 90 days · Originality: Synthesis-oriented

AI Analysis

This work identifies critical security risks in AI agents used by millions of users. Its contribution is incremental: it builds on existing NIST risk management principles and introduces no new methods.

The paper addresses security challenges in frontier AI agents. It highlights new failure modes such as indirect prompt injection and cascading failures, recommends layered defenses, and identifies standards and research gaps, including guidance for secure multi-agent system design.

This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI agents. These insights are informed by Perplexity's experience operating general-purpose agentic systems used by millions of users and thousands of enterprises in both controlled and open-world environments. Agent architectures change core assumptions around code-data separation, authority boundaries, and execution predictability, creating new confidentiality, integrity, and availability failure modes. We map principal attack surfaces across tools, connectors, hosting boundaries, and multi-agent coordination, with particular emphasis on indirect prompt injection, confused-deputy behavior, and cascading failures in long-running workflows. We then assess current defenses as a layered stack: input-level and model-level mitigations, sandboxed execution, and deterministic policy enforcement for high-consequence actions. Finally, we identify standards and research gaps, including adaptive security benchmarks, policy models for delegation and privilege control, and guidance for secure multi-agent system design aligned with NIST risk management principles.
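To make the last layer of that stack concrete, here is a minimal Python sketch of deterministic policy enforcement for high-consequence actions. It is an illustration under stated assumptions, not the authors' implementation: the tool names, the `tainted` flag, and the `authorize` function are all hypothetical. The point it shows is that the final allow/deny decision comes from fixed rules rather than from model output, so an indirect prompt injection cannot talk its way past the gate.

```python
from dataclasses import dataclass, field

# Hypothetical set of high-consequence tools; the source does not enumerate any.
HIGH_CONSEQUENCE = frozenset({"send_email", "transfer_funds", "delete_file"})


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict = field(default_factory=dict)
    # Assumption: the runtime marks a call as tainted when it was planned
    # after the agent read untrusted content (web pages, emails, documents).
    tainted: bool = False


def authorize(call: ToolCall, user_confirmed: bool = False) -> str:
    """Return 'allow', 'confirm', or 'deny' using fixed rules only."""
    if call.tool not in HIGH_CONSEQUENCE:
        return "allow"      # low-consequence actions pass through
    if call.tainted:
        return "deny"       # untrusted input may not drive sensitive actions
    if user_confirmed:
        return "allow"      # explicit approval from the human principal
    return "confirm"        # otherwise ask the user before proceeding


if __name__ == "__main__":
    print(authorize(ToolCall("search_web")))
    print(authorize(ToolCall("send_email", tainted=True), user_confirmed=True))
    print(authorize(ToolCall("send_email")))
```

Denying tainted calls outright, even with user confirmation, is one deliberately conservative design choice; a real deployment might instead escalate such calls for review.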
