AI CR GT Jan 31, 2025

An Empirical Game-Theoretic Analysis of Autonomous Cyber-Defence Agents

Gregory Palmer, Luke Swaby, Daniel J. B. Harrold, Matthew Stewart, Alex Hiles, Chris Willis, Ian Miles, Sara Farmer

arXiv:2501.19206v15.81 citations h-index: 3 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for generalizable and assured cyber-defence policies, though it is incremental in extending existing methods.

The paper tackles the challenge of building robust and resilient autonomous cyber-defence agents. It introduces a potential-based reward shaping approach that expedites the computationally expensive double oracle algorithm, with the aim of making policy learning more efficient.
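For readers unfamiliar with potential-based reward shaping, the following is a minimal sketch of the standard form, in which a shaping term gamma * Phi(s') - Phi(s) is added to the environment reward. The function name `shaped_reward`, its parameters, and the toy potential used to exercise it are hypothetical; the summary above does not say which potential function the paper uses.

```python
def shaped_reward(reward, state, next_state, done, potential, gamma=0.99):
    """Return r + gamma * Phi(s') - Phi(s).

    `potential` is any function mapping a state to a float (an assumption
    here, not the paper's choice). The potential of a terminal state is
    taken as 0, a common convention in episodic tasks.
    """
    next_potential = 0.0 if done else potential(next_state)
    return reward + gamma * next_potential - potential(state)
```

With zero environment reward, the discounted shaping terms along an episode telescope to -Phi(s0), so every policy's return from a given start state shifts by the same constant. This is the reason shaping of this form leaves the ranking of policies unchanged.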

The recent rise in increasingly sophisticated cyber-attacks raises the need for robust and resilient autonomous cyber-defence (ACD) agents. Given the variety of cyber-attack tactics, techniques and procedures (TTPs) employed, learning approaches that can return generalisable policies are desirable. Meanwhile, the assurance of ACD agents remains an open challenge. We address both challenges via an empirical game-theoretic analysis of deep reinforcement learning (DRL) approaches for ACD using the principled double oracle (DO) algorithm. This algorithm relies on adversaries iteratively learning (approximate) best responses against each other's policies, a computationally expensive endeavour for autonomous cyber operations agents. In this work we introduce and evaluate a theoretically sound, potential-based reward shaping approach to expedite this process. In addition, given the increasing number of open-source ACD-DRL approaches, we extend the DO formulation to allow for multiple response oracles (MRO), providing a framework for a holistic evaluation of ACD approaches.
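The double oracle loop described in the abstract can be sketched on a toy zero-sum matrix game. This is an illustration under stated assumptions, not the paper's implementation: best responses here are exact argmax/argmin over a known payoff matrix rather than DRL-trained policies, the restricted game is solved approximately with fictitious play as a stand-in solver, and all names (`RPS`, `fictitious_play`, `double_oracle`) are hypothetical.

```python
# Toy double oracle on a zero-sum matrix game (illustrative only).
# payoff[i][j] is the row player's (defender's) payoff; the column
# player (attacker) receives the negative.
RPS = [[0, -1, 1],   # rock
       [1, 0, -1],   # paper
       [-1, 1, 0]]   # scissors


def fictitious_play(payoff, rows, cols, iters=2000):
    """Approximate an equilibrium of the game restricted to rows x cols."""
    row_counts = [0] * len(rows)
    col_counts = [0] * len(cols)
    r, c = 0, 0  # arbitrary first actions (local indices)
    for _ in range(iters):
        row_counts[r] += 1
        col_counts[c] += 1
        # Each side best-responds to the other's empirical play so far.
        r = max(range(len(rows)), key=lambda i: sum(
            col_counts[j] * payoff[rows[i]][cols[j]] for j in range(len(cols))))
        c = min(range(len(cols)), key=lambda j: sum(
            row_counts[i] * payoff[rows[i]][cols[j]] for i in range(len(rows))))
    return [n / iters for n in row_counts], [n / iters for n in col_counts]


def double_oracle(payoff, max_rounds=50):
    """Grow each player's strategy set with best responses until none is new."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # start from one arbitrary strategy per player
    for _ in range(max_rounds):
        p, q = fictitious_play(payoff, rows, cols)
        # Oracle step: best response over ALL strategies to the opponent's
        # current mixture on the restricted set.
        br_row = max(range(n_rows), key=lambda i: sum(
            q[k] * payoff[i][cols[k]] for k in range(len(cols))))
        br_col = min(range(n_cols), key=lambda j: sum(
            p[k] * payoff[rows[k]][j] for k in range(len(rows))))
        if br_row in rows and br_col in cols:
            break  # neither oracle found a new strategy
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)
    return rows, cols, p, q
```

Calling `double_oracle(RPS)` grows both strategy sets from rock alone to all three actions and returns mixtures close to uniform. In the setting the abstract describes, each oracle call would instead be a DRL training run, which is where the computational cost arises.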
