Adversarial Deep Reinforcement Learning based Adaptive Moving Target Defense
This work addresses proactive cybersecurity defense for systems exposed to attack. The contribution appears incremental: it keeps an established MTD model and changes only the learning approach used to solve it.
The paper tackles the challenge of optimizing moving target defense (MTD) strategies against adaptive adversaries. It proposes a multi-agent reinforcement learning framework built on a two-player game model and reports experiments in which the framework finds optimal policies.
Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when to make changes and what to change, taking into account both the characteristics of its system and the adversary's observed activities. Finding an optimal MTD strategy is a significant challenge, especially against a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose a multi-agent partially observable Markov decision process model of MTD and formulate a two-player general-sum game between the adversary and the defender. Building on an established model of adaptive MTD, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm to solve the game. In experiments, we show the effectiveness of our framework in finding optimal policies.
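The abstract does not spell out the double oracle loop, so the following is only a minimal sketch of the general idea under strong simplifying assumptions, not the paper's method. It assumes a small zero-sum matrix game (the paper's game is general-sum), replaces the reinforcement learning best-response oracles with exact enumeration, and approximates the restricted-game equilibrium with fictitious play. The names `solve_restricted` and `double_oracle` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # payoff[r][c]: defender (row) payoff; attacker (column) gets the negative


def solve_restricted(payoff: Matrix, rows: List[int], cols: List[int],
                     iters: int = 5000) -> Tuple[List[float], List[float]]:
    """Approximate an equilibrium of the restricted game by fictitious play.

    Assumption: zero-sum payoffs. This stands in for whatever equilibrium
    solver a real framework would use.
    """
    row_counts = [0] * len(rows)
    col_counts = [0] * len(cols)
    row_counts[0] = 1
    col_counts[0] = 1
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical play so far.
        r_vals = [sum(payoff[r][c] * col_counts[j] for j, c in enumerate(cols))
                  for r in rows]
        c_vals = [sum(payoff[r][c] * row_counts[i] for i, r in enumerate(rows))
                  for c in cols]
        row_counts[r_vals.index(max(r_vals))] += 1  # defender maximizes
        col_counts[c_vals.index(min(c_vals))] += 1  # attacker minimizes
    r_total, c_total = sum(row_counts), sum(col_counts)
    return ([x / r_total for x in row_counts],
            [x / c_total for x in col_counts])


def double_oracle(payoff: Matrix, max_rounds: int = 50):
    """Grow both players' strategy sets until neither oracle adds a strategy."""
    rows, cols = [0], [0]  # start from one arbitrary pure strategy each
    p, q = [1.0], [1.0]
    for _ in range(max_rounds):
        p, q = solve_restricted(payoff, rows, cols)
        # Best-response oracles search the FULL strategy sets. Here they are
        # exact enumeration; a learning-based framework would train a policy.
        def_vals = [sum(payoff[r][c] * q[j] for j, c in enumerate(cols))
                    for r in range(len(payoff))]
        att_vals = [sum(payoff[r][c] * p[i] for i, r in enumerate(rows))
                    for c in range(len(payoff[0]))]
        br_row = def_vals.index(max(def_vals))
        br_col = att_vals.index(min(att_vals))
        new_rows = [] if br_row in rows else [br_row]
        new_cols = [] if br_col in cols else [br_col]
        if not new_rows and not new_cols:
            break  # both best responses are already in the restricted game
        rows += new_rows
        cols += new_cols
    else:
        # Round budget exhausted right after growing: re-solve so that the
        # returned mixtures match the returned strategy sets.
        p, q = solve_restricted(payoff, rows, cols)
    return rows, cols, p, q


if __name__ == "__main__":
    # Rock-paper-scissors as a stand-in payoff matrix (illustrative only).
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    print(double_oracle(rps))
```

The point of the structure is that the equilibrium is only ever computed over the strategies discovered so far, while the oracles search the full strategy space. This is what makes the approach attractive when that space is too large to enumerate and best responses have to be learned instead.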