Discovering Exfiltration Paths Using Reinforcement Learning with Attack Graphs
This work inverts a prior crown jewels identification approach to study data exfiltration, rather than compromise, in enterprise networks. Because it builds directly on that earlier method, the contribution appears incremental.
The paper tackles the problem of identifying optimal data exfiltration paths in enterprise networks where adversaries aim to avoid detection. It combines reinforcement learning with attack graphs and cyber terrain to develop reward functions, and reports promising performance in a sizable network environment.
Reinforcement learning (RL), in conjunction with attack graphs and cyber terrain, is used to develop the reward and state associated with determining optimal paths for exfiltration of data in enterprise networks. This work builds on previous crown jewels (CJ) identification work, which focused on computing optimal paths that adversaries may traverse toward compromising CJs or hosts within their proximity. This work inverts the previous CJ approach based on the assumption that data has been stolen and must now be quietly exfiltrated from the network. RL is utilized to support the development of a reward function based on the identification of those paths where adversaries desire reduced detection. Results demonstrate promising performance for a sizable network environment.
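To make the idea of a detection-aware reward concrete, the following is a minimal sketch and not the paper's implementation. It assumes tabular Q-learning (the abstract does not name the RL algorithm), a hypothetical five-node attack graph, and invented detection penalties standing in for cyber terrain. All node names, reward constants, and hyperparameters are illustrative assumptions.

```python
import random

# Hypothetical toy attack graph: host -> hosts reachable in one hop.
# The agent starts on the compromised crown jewel and must reach the internet.
GRAPH = {
    "crown_jewel": ["dmz_server", "workstation"],
    "dmz_server": ["internet"],
    "workstation": ["proxy"],
    "proxy": ["internet"],
    "internet": [],
}
# Assumed detection penalty for moving onto each host (a stand-in for
# cyber terrain such as a monitored segment); values are invented.
DETECTION = {"dmz_server": 10.0, "workstation": 1.0, "proxy": 1.0, "internet": 0.0}
START, GOAL = "crown_jewel", "internet"
STEP_COST, GOAL_REWARD = 1.0, 20.0


def reward(next_node):
    """Reward for hopping onto next_node: step cost, detection risk, goal bonus."""
    r = -STEP_COST - DETECTION[next_node]
    if next_node == GOAL:
        r += GOAL_REWARD
    return r


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration over graph edges."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s, nbrs in GRAPH.items() for a in nbrs}
    for _ in range(episodes):
        state = START
        while state != GOAL:
            actions = GRAPH[state]
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=lambda a: q[(state, a)])  # exploit
            # Best value obtainable from the next host (0 at the terminal goal).
            future = max((q[(action, a)] for a in GRAPH[action]), default=0.0)
            target = reward(action) + gamma * future
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = action
    return q


def greedy_path(q):
    """Follow the highest-valued edge from START until GOAL is reached."""
    path, state = [START], START
    while state != GOAL:
        state = max(GRAPH[state], key=lambda a: q[(state, a)])
        path.append(state)
    return path


q_table = train()
print(greedy_path(q_table))
```

Under these assumed values, the learned policy prefers the longer route through the workstation and proxy over the shorter hop through the heavily monitored DMZ server, which illustrates how a detection term in the reward can favor quieter exfiltration paths.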