CR | AI | Aug 17, 2025

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

arXiv:2508.13220v2 | 33 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses MCP security vulnerabilities that affect AI agents and their developers, and it provides a standardized tool for testing them. The contribution is incremental because it builds on existing MCP frameworks.

The paper tackles the security risks introduced by the Model Context Protocol (MCP) in LLM-based applications by presenting the first systematic taxonomy of MCP security, identifying 17 attack types, and introducing MCPSecBench, a benchmark and playground that shows over 85% of attacks successfully compromise at least one platform.

Large Language Models (LLMs) are increasingly integrated into real-world applications via the Model Context Protocol (MCP), a universal, open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new security risks and expands their attack surfaces. In this paper, we present the first systematic taxonomy of MCP security, identifying 17 attack types across 4 primary attack surfaces. We introduce MCPSecBench, a comprehensive security benchmark and playground that integrates prompt datasets, MCP servers, MCP clients, attack scripts, and protection mechanisms to evaluate these attacks across three major MCP providers. Our benchmark is modular and extensible, allowing researchers to incorporate custom implementations of clients, servers, and transport protocols for systematic security assessment. Experimental results show that over 85% of the identified attacks successfully compromise at least one platform, with core vulnerabilities universally affecting Claude, OpenAI, and Cursor, while prompt-based and tool-centric attacks exhibit considerable variability across different hosts and models. In addition, current protection mechanisms have little effect against these attacks. Overall, MCPSecBench standardizes the evaluation of MCP security and enables rigorous testing across all MCP layers.
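The headline figure above (the share of attacks that compromise at least one platform) and the notion of attacks that affect every platform can be computed from a per-attack, per-platform outcome table. The following is a minimal sketch of that aggregation, not MCPSecBench's own code: the function names, the attack labels, and every outcome value are hypothetical placeholders and do not reproduce the paper's results.

```python
from typing import Dict, List

# Hypothetical layout: attack name -> {platform name -> did the attack succeed?}
Outcomes = Dict[str, Dict[str, bool]]


def fraction_compromising_any(outcomes: Outcomes) -> float:
    """Share of attacks that succeed on at least one platform."""
    if not outcomes:
        return 0.0
    hits = sum(1 for per_platform in outcomes.values() if any(per_platform.values()))
    return hits / len(outcomes)


def universal_attacks(outcomes: Outcomes) -> List[str]:
    """Names of attacks that succeed on every platform tested."""
    return sorted(
        name
        for name, per_platform in outcomes.items()
        if per_platform and all(per_platform.values())
    )


# Placeholder data for illustration only; these are not results from the paper.
example: Outcomes = {
    "attack_a": {"Claude": True, "OpenAI": True, "Cursor": True},
    "attack_b": {"Claude": False, "OpenAI": True, "Cursor": False},
    "attack_c": {"Claude": False, "OpenAI": False, "Cursor": False},
    "attack_d": {"Claude": True, "OpenAI": False, "Cursor": True},
}

if __name__ == "__main__":
    print(fraction_compromising_any(example))
    print(universal_attacks(example))
```

Keeping the table keyed by attack and then by platform makes both views cheap to derive: "at least one platform" is an `any` over a row, and "universal" is an `all` over the same row.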
