CR · AI · Oct 29, 2024

HonestCyberEval: An AI Cyber Risk Benchmark for Automated Software Exploitation

arXiv:2410.21939v3 · 2 citations · h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for systematic evaluation of AI cyber risks in realistic cyber offense operations, providing a foundational benchmark for the cybersecurity domain.

The paper introduces the HonestCyberEval benchmark to assess AI models' capabilities and risks in automated software exploitation. The benchmark evaluates how well models detect and exploit vulnerabilities in real-world software such as Nginx. Among the models tested, o1-preview achieves the highest success rate (92.85%), while o3-mini and Claude-3.7-sonnet-20250219 are cost-effective but less successful alternatives.

We introduce HonestCyberEval, a new benchmark for assessing AI models' capabilities and risks in automated software exploitation, focusing on their ability to detect and exploit vulnerabilities in real-world software systems. Our evaluation leverages the Nginx web server repository augmented with synthetic vulnerabilities. We assess several leading language models, including OpenAI's GPT-4.5, o3-mini, o1 and o1-mini, Anthropic's Claude-3-7-sonnet-20250219, Claude-3.5-sonnet-20241022 and Claude-3.5-sonnet-20240620, Google DeepMind's Gemini-1.5-pro, and OpenAI's earlier GPT-4o model. Our findings reveal that these models vary significantly in their success rates and efficiency, with o1-preview achieving the highest success rate (92.85%) and o3-mini and Claude-3.7-sonnet-20250219 providing cost-effective but less successful alternatives. This risk evaluation establishes a foundation for systematically evaluating the AI cyber risk in realistic cyber offence operations.

