Deep VULMAN: A Deep Reinforcement Learning-Enabled Cyber Vulnerability Management Framework
This work addresses resource-constrained vulnerability management for cybersecurity operations centers, offering a sequential decision-making approach to improve prioritization. The contribution is incremental, since it builds on existing methods.
The authors tackle the problem of prioritizing and selecting vulnerabilities for mitigation by proposing Deep VULMAN, a framework that combines deep reinforcement learning with integer programming. They report that it outperforms current deterministic methods on both simulated and real-world vulnerability data observed over a one-year period.
Cyber vulnerability management is a critical function of a cybersecurity operations center (CSOC) that helps protect organizations against cyber-attacks on their computer and network systems. Adversaries hold an asymmetric advantage over the CSOC, as the number of deficiencies in these systems is growing significantly faster than security teams can expand to mitigate them in a resource-constrained environment. Current approaches are deterministic, one-time decision-making methods that do not consider future uncertainties when prioritizing and selecting vulnerabilities for mitigation. These approaches are also constrained by a sub-optimal distribution of resources and provide no flexibility to adjust their response to fluctuations in vulnerability arrivals. We propose a novel framework, Deep VULMAN, consisting of a deep reinforcement learning agent and an integer programming method, to fill this gap in the cyber vulnerability management process. Our sequential decision-making framework first determines the near-optimal amount of resources to allocate for mitigation under uncertainty for a given system state, and then determines the optimal set of prioritized vulnerability instances for mitigation. Our proposed framework outperforms current methods in prioritizing the selection of important organization-specific vulnerabilities on both simulated and real-world vulnerability data observed over a one-year period.
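The second stage described above, choosing a set of vulnerability instances once a resource budget is fixed, can be illustrated with a small sketch. This is not the authors' formulation: it assumes, for illustration only, that each instance has a single priority score and an integer mitigation time in hours, and that the budget is a single number of hours handed over by the first stage. Under those assumptions the selection reduces to a 0/1 knapsack problem, solved here by dynamic programming rather than with an integer programming solver. The function name `select_vulnerabilities` and all data values are hypothetical.

```python
from typing import List, Tuple


def select_vulnerabilities(
    scores: List[float],
    hours: List[int],
    budget_hours: int,
) -> Tuple[float, List[int]]:
    """Pick the subset of instances with the highest total score
    whose total mitigation time fits within budget_hours.

    Assumption: one score and one integer time cost per instance
    (a 0/1 knapsack), which is a simplification for illustration.
    """
    n = len(scores)
    # best[i][b] = best total score using the first i instances with b hours
    best = [[0.0] * (budget_hours + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for b in range(budget_hours + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip instance i-1
            if hours[i - 1] <= b:
                # option 2: mitigate instance i-1 if it fits
                candidate = best[i - 1][b - hours[i - 1]] + scores[i - 1]
                if candidate > best[i][b]:
                    best[i][b] = candidate

    # Walk back through the table to recover which instances were chosen
    chosen: List[int] = []
    b = budget_hours
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= hours[i - 1]
    chosen.reverse()
    return best[n][budget_hours], chosen


# Hypothetical data: four vulnerability instances
scores = [9.0, 6.5, 4.0, 7.5]  # made-up priority scores
hours = [5, 3, 2, 4]           # made-up mitigation times in hours
total, picked = select_vulnerabilities(scores, hours, budget_hours=8)
print(total, picked)
```

In the framework as described, the budget passed to this step would come from the deep reinforcement learning agent and could change from one decision period to the next. The fixed `budget_hours=8` above stands in for that output.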