CR AI

May 23, 2025

Dynamic Risk Assessments for Offensive Cybersecurity Agents

Boyi Wei, Benedikt Stroebl, Jiacen Xu, Joie Zhang, Zhou Li, Peter Henderson

Princeton

arXiv:2505.18384v513.44 citations

h-index: 8

Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the risk of automated cyber-attacks, a concern for cybersecurity practitioners. The advance is incremental: it builds on existing threat models by emphasizing dynamic assessments.

The paper tackles the problem of assessing cybersecurity risks of autonomous offensive agents by showing that adversaries can improve an agent's capability by over 40% on InterCode CTF with a small compute budget of 8 H100 GPU hours, highlighting the need for dynamic risk evaluations.

Foundation models are increasingly becoming better autonomous programmers, raising the prospect that they could also automate dangerous offensive cyber-operations. Current frontier model audits probe the cybersecurity risks of such agents, but most fail to account for the degrees of freedom available to adversaries in the real world. In particular, with strong verifiers and financial incentives, agents for offensive cybersecurity are amenable to iterative improvement by would-be adversaries. We argue that assessments should take into account an expanded threat model in the context of cybersecurity, emphasizing the varying degrees of freedom that an adversary may possess in stateful and non-stateful environments within a fixed compute budget. We show that even with a relatively small compute budget (8 H100 GPU hours in our study), adversaries can improve an agent's cybersecurity capability on InterCode CTF by more than 40% relative to the baseline, without any external assistance. These results highlight the need to evaluate agents' cybersecurity risk in a dynamic manner, painting a more representative picture of risk.
