Quantitative Resilience Modeling for Autonomous Cyber Defense
This work addresses the problem of measuring and improving cyber resilience for network security operators, although its contribution appears incremental because it builds on existing RL frameworks.
The authors propose a formal formulation of network cyber resilience that accounts for multiple defender goals and the criticality of network resources. They evaluate it in the CybORG reinforcement learning environment and show that proactive hardening and prompt recovery are critical for effective defenses.
Cyber resilience is the ability of a system to recover from an attack with minimal impact on system operations. However, characterizing a network's resilience under a cyber attack is challenging, as there are no formal definitions of resilience applicable to diverse network topologies and attack patterns. In this work, we propose a quantifiable formulation of resilience that considers multiple defender operational goals and the criticality of various network resources for daily operations, and that gives security operators an interpretable view of their system's resilience under attack. We evaluate our approach within the CybORG environment, a reinforcement learning (RL) framework for autonomous cyber defense, analyzing trade-offs between resilience, costs, and prioritization of operational goals. Furthermore, we introduce methods to aggregate resilience metrics across time-varying attack patterns and multiple network topologies, comprehensively characterizing system resilience. Using insights gained from our resilience metrics, we design RL autonomous defensive agents and compare them against several heuristic baselines, showing that proactive network hardening techniques and prompt recovery of compromised machines are critical for effective cyber defenses.
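To make the idea of a criticality-weighted, multi-goal resilience score concrete, the following is a minimal sketch and not the formulation proposed in the paper, which the abstract does not spell out. It assumes that each operational goal yields a performance trace with values in [0, 1], that resilience for a goal is the time-average of that trace, and that goals and scenarios are combined by weighted and plain averages. All function names, host names, and weights are hypothetical.

```python
from statistics import mean


def weighted_performance(host_status, criticality):
    """Criticality-weighted fraction of hosts that are operational at one time step."""
    total = sum(criticality.values())
    up = sum(criticality[h] for h, ok in host_status.items() if ok)
    return up / total


def goal_resilience(trace):
    """Time-average of a performance trace whose samples lie in [0, 1] (assumed metric)."""
    if not trace:
        raise ValueError("performance trace must be non-empty")
    return sum(trace) / len(trace)


def episode_resilience(traces, goal_weights):
    """Combine per-goal resilience using operator-chosen goal priorities."""
    total = sum(goal_weights.values())
    return sum(w * goal_resilience(traces[g]) for g, w in goal_weights.items()) / total


def aggregate_resilience(episode_scores):
    """Plain average across attack patterns or topologies (assumed aggregation)."""
    return mean(episode_scores)


# Hypothetical two-host network: the database is three times as critical as the web server.
criticality = {"web": 1.0, "db": 3.0}
statuses = [
    {"web": True, "db": True},   # before the attack
    {"web": True, "db": False},  # database compromised
    {"web": True, "db": True},   # database restored
]
availability = [weighted_performance(s, criticality) for s in statuses]

traces = {"availability": availability, "confidentiality": [1.0, 1.0, 0.5]}
goal_weights = {"availability": 2.0, "confidentiality": 1.0}
score = episode_resilience(traces, goal_weights)
overall = aggregate_resilience([score, 1.0])  # second episode: no successful attack
print(availability, score, overall)
```

Under these assumptions, a faster recovery shortens the dip in the availability trace and raises the score, while hardening a highly critical host prevents the largest drops, which mirrors the qualitative conclusion stated in the abstract.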