Toward a Safe Internet of Agents
This paper offers a foundational framework for developers and researchers engineering safe agentic systems, a need that grows more pressing as autonomous AI agents become interconnected.
The paper provides a principled guide to safety and security risks in the Internet of Agents (IoA), analyzing vulnerabilities across single agents, multi-agent systems, and interoperable multi-agent systems, and deriving core mitigation principles for building safe agentic AI.
Autonomous Artificial Intelligence (AI) agents, powered by Large Language Models (LLMs), are advancing rapidly toward interconnected systems: an Internet of Agents (IoA). This vision enables complex problem-solving but also introduces systemic safety and security risks. Going beyond existing threat taxonomies, we provide a principled guide to the architectural sources of vulnerability. We offer a framework for engineering safe agentic systems through bottom-up deconstruction, analyzing each component as a dual-use interface in which every expansion of capability also enlarges the attack surface. We examine three tiers: (1) Single Agents, analyzing inherent risks in models, memory, design patterns, tools, and guardrails; (2) Multi-Agent Systems (MAS), examining the components of collective behavior, including architectural patterns, communication mechanisms, verification, and system guardrails; and (3) Interoperable Multi-Agent Systems (IMAS), exploring four pillars of a secure ecosystem: standardized protocols, agent registration and discovery, resource vetting, and governance. Our analysis reveals a central principle: agentic safety must be co-designed with capability as a fundamental architectural property. We identify specific vulnerabilities at each tier and derive core mitigation principles. The result is a foundational guide that enables developers and researchers to build agentic AI that is not merely capable but also safe and reliable, contributing to the development of a secure IoA.