CR · AI · Feb 23, 2025

RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents

arXiv:2502.16730v1 · 19.319 citations · h-index: 1
Originality: Incremental advance
AI Analysis

RapidPen aims to make penetration testing more accessible and cost-efficient for organizations without dedicated security teams. The contribution appears incremental, since it builds on existing LLM and ReAct-style methods.

The paper tackles the problem of gaining an initial foothold (IP-to-Shell) in penetration testing without human intervention. In an evaluation against one vulnerable Hack The Box target, RapidPen achieved shell access within 200-400 seconds at a cost of about $0.3-$0.6 per run, with a 60% success rate when reusing prior "success-case" data.
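The per-run cost and the success rate can be combined into a rough cost per successful shell. The sketch below is a back-of-the-envelope calculation, not a figure from the paper: it assumes every run is independent and succeeds with the same probability, so the expected number of runs until the first success is 1 / success rate. The function name `expected_cost_per_success` is hypothetical.

```python
def expected_cost_per_success(cost_per_run: float, success_rate: float) -> float:
    """Expected spend until the first success, assuming independent runs."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    # Geometric distribution: the expected number of runs is 1 / success_rate.
    return cost_per_run / success_rate


# Reported figures: $0.3-$0.6 per run, 60% success rate.
low = expected_cost_per_success(0.3, 0.6)
high = expected_cost_per_success(0.6, 0.6)
print(f"${low:.2f} to ${high:.2f} per successful shell")
```

Under these assumptions the reported numbers work out to roughly $0.50 to $1.00 per successful shell. Real runs may not be independent, particularly when later runs reuse earlier "success-case" data, so treat this as an order-of-magnitude estimate.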

We present RapidPen, a fully automated penetration testing (pentesting) framework that addresses the challenge of achieving an initial foothold (IP-to-Shell) without human intervention. Unlike prior approaches that focus primarily on post-exploitation or require a human-in-the-loop, RapidPen leverages large language models (LLMs) to autonomously discover and exploit vulnerabilities, starting from a single IP address. By integrating advanced ReAct-style task planning (Re) with retrieval-augmented knowledge bases of successful exploits, along with a command-generation and direct execution feedback loop (Act), RapidPen systematically scans services, identifies viable attack vectors, and executes targeted exploits in a fully automated manner. In our evaluation against a vulnerable target from the Hack The Box platform, RapidPen achieved shell access within 200-400 seconds at a per-run cost of approximately $0.3-$0.6, demonstrating a 60% success rate when reusing prior "success-case" data. These results underscore the potential of truly autonomous pentesting for both security novices and seasoned professionals. Organizations without dedicated security teams can leverage RapidPen to quickly identify critical vulnerabilities, while expert pentesters can offload repetitive tasks and focus on complex challenges. Ultimately, our work aims to make penetration testing more accessible and cost-efficient, thereby enhancing the overall security posture of modern software ecosystems.
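The control flow the abstract describes, a planning step (Re) alternating with command execution and feedback (Act), can be sketched as a generic loop. This is a minimal illustration of the ReAct pattern, not RapidPen's implementation: `AgentState`, `Step`, `react_loop`, and the `plan`, `execute`, and `is_shell` callables are all hypothetical names, the retrieval step is assumed to live inside `plan`, and the demo uses canned strings, so nothing is actually executed against any host.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Step:
    thought: str      # planner's reasoning for this step
    command: str      # command the planner chose
    observation: str  # output fed back to the planner


@dataclass
class AgentState:
    target: str
    history: list[Step] = field(default_factory=list)
    shell_obtained: bool = False


def react_loop(
    target: str,
    plan: Callable[[AgentState], Optional[tuple[str, str]]],
    execute: Callable[[str], str],
    is_shell: Callable[[str], bool],
    max_steps: int = 10,
) -> AgentState:
    """Alternate planning (Re) and execution (Act) until success or give-up."""
    state = AgentState(target=target)
    for _ in range(max_steps):
        decision = plan(state)          # "Re": pick the next task from history
        if decision is None:            # planner has nothing left to try
            break
        thought, command = decision
        observation = execute(command)  # "Act": run it and capture the output
        state.history.append(Step(thought, command, observation))
        if is_shell(observation):
            state.shell_obtained = True
            break
    return state


# Canned stand-ins so the sketch runs offline (assumptions, not real tools).
SCRIPT = [
    ("enumerate services", "scan-services"),
    ("try a known technique for the service found", "attempt-access"),
]
FAKE_OUTPUT = {"scan-services": "service found", "attempt-access": "SHELL"}


def stub_plan(state: AgentState) -> Optional[tuple[str, str]]:
    i = len(state.history)
    return SCRIPT[i] if i < len(SCRIPT) else None


# 192.0.2.10 is a reserved documentation address (TEST-NET-1).
result = react_loop("192.0.2.10", stub_plan, FAKE_OUTPUT.__getitem__,
                    lambda out: out == "SHELL")
print(result.shell_obtained, len(result.history))
```

Passing the planner and executor in as callables keeps the loop itself independent of any particular LLM or tool, which is one plausible way to separate the two phases. The `max_steps` cap bounds the time and cost of a run regardless of what the planner proposes.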

