CR · AI · May 23, 2025

Dynamic Risk Assessments for Offensive Cybersecurity Agents

Princeton
arXiv:2505.18384v5 · 4 citations · h-index: 8
Originality: Incremental advance
AI Analysis

The paper addresses the risk of automated cyber-attacks, a concern for cybersecurity practitioners, but the advance is incremental: it builds on existing threat models by emphasizing dynamic assessments.

The paper tackles the problem of assessing the cybersecurity risks of autonomous offensive agents. It shows that adversaries can improve an agent's capability on InterCode CTF by more than 40% relative to the baseline with a small compute budget of 8 H100 GPU hours, highlighting the need for dynamic risk evaluations.

Foundation models are increasingly becoming better autonomous programmers, raising the prospect that they could also automate dangerous offensive cyber-operations. Current frontier model audits probe the cybersecurity risks of such agents, but most fail to account for the degrees of freedom available to adversaries in the real world. In particular, with strong verifiers and financial incentives, agents for offensive cybersecurity are amenable to iterative improvement by would-be adversaries. We argue that assessments should take into account an expanded threat model in the context of cybersecurity, emphasizing the varying degrees of freedom that an adversary may possess in stateful and non-stateful environments within a fixed compute budget. We show that even with a relatively small compute budget (8 H100 GPU hours in our study), adversaries can improve an agent's cybersecurity capability on InterCode CTF by more than 40% relative to the baseline -- without any external assistance. These results highlight the need to evaluate agents' cybersecurity risk in a dynamic manner, painting a more representative picture of risk.
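
The "more than 40% relative to the baseline" figure is a relative gain, not a gain in percentage points. The minimal sketch below shows how such a figure is computed. The function name `relative_improvement` and the two solve rates are hypothetical placeholders chosen for illustration; they are not results reported in the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Hypothetical solve rates, for illustration only (not taken from the paper).
baseline_rate = 0.50   # fraction of tasks solved by the unmodified agent
improved_rate = 0.71   # fraction solved after adversary-side improvement

gain = relative_improvement(baseline_rate, improved_rate)
print(f"relative improvement: {gain:.0%}")
```

With these placeholder rates, the absolute gain is 21 percentage points while the relative gain is about 42%, so the two readings of "40%" differ substantially.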

Code Implementations: 1 repo
