A Receding-Horizon MDP Approach for Performance Evaluation of Moving Target Defense in Networks
This work addresses network security for systems vulnerable to multi-stage attacks. Its contribution is incremental: it applies existing MDP methods to a specific defense scenario.
The paper evaluates proactive moving target defense in networks by modeling attacker behavior with a receding-horizon MDP and analyzing how the randomization frequency and the number of detection systems affect the attacker's success rate in a synthetic network.
In this paper, we study the problem of assessing the effectiveness of a proactive defense-by-detection policy combined with a network-based moving target defense. We model the network system using a probabilistic attack graph, a graphical security model. Given a network system with a proactive defense strategy, an intelligent attacker must perform reconnaissance repeatedly to learn the locations of intrusion detection systems and re-plan optimally to reach the target while avoiding detection. To compute the attacker's strategy for security evaluation, we develop a receding-horizon planning algorithm using a risk-sensitive Markov decision process with a time-varying reward function. Finally, we implement both defense and attack strategies in a synthetic network and analyze how the frequency of network randomization and the number of detection systems influence the success rate of the attacker. This study provides insights for designing proactive defense strategies against online, multi-stage attacks by a resourceful attacker.
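The receding-horizon loop described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four-node graph, the exploit success probabilities, the reward and penalty values, and the names `plan`, `simulate`, and `success_rate`. The planner is a risk-neutral finite-horizon value iteration with a detection penalty, used as a simple stand-in for the paper's risk-sensitive formulation. The reward changes over time only because the attacker's knowledge of detector locations changes.

```python
import random

# Hypothetical attack graph: node -> [(next_node, exploit_success_probability)].
GRAPH = {0: [(1, 0.9), (2, 0.9)], 1: [(3, 0.8)], 2: [(3, 0.8)], 3: []}
TARGET = 3
CANDIDATE_IDS_NODES = [1, 2]  # assumed nodes where the defender may place a detector


def plan(known_ids, horizon, detect_penalty=-10.0, target_reward=10.0):
    """Finite-horizon value iteration; returns the first-step action per node.

    Risk-neutral stand-in (assumption): maximizes expected reward, where
    entering a known detector node is penalized and ends the attack.
    An action of None means "wait".
    """
    values = {s: 0.0 for s in GRAPH}
    policy = {s: None for s in GRAPH}
    for _ in range(horizon):
        new_values = {}
        for s, edges in GRAPH.items():
            best_q, best_a = values[s], None  # waiting keeps the current value
            for nxt, p in edges:
                if nxt in known_ids:
                    gain = detect_penalty
                elif nxt == TARGET:
                    gain = target_reward
                else:
                    gain = values[nxt]
                q = p * gain + (1 - p) * values[s]  # a failed exploit stays put
                if q > best_q:
                    best_q, best_a = q, nxt
            new_values[s], policy[s] = best_q, best_a
        values = new_values
    return policy


def simulate(randomize_every, recon_every, horizon=3, max_steps=20, rng=None):
    """Run one attack episode; returns True if the attacker reaches TARGET."""
    if rng is None:
        rng = random.Random(0)
    state = 0
    ids_nodes = {rng.choice(CANDIDATE_IDS_NODES)}
    known_ids = set(ids_nodes)
    for t in range(max_steps):
        if t > 0 and t % randomize_every == 0:
            ids_nodes = {rng.choice(CANDIDATE_IDS_NODES)}  # defender re-randomizes
        if t % recon_every == 0:
            known_ids = set(ids_nodes)  # attacker reconnaissance
        action = plan(known_ids, horizon)[state]  # re-plan, execute first step only
        if action is None:
            continue
        if rng.random() < dict(GRAPH[state])[action]:
            state = action
            if state in ids_nodes:  # detection is checked on entry only (assumption)
                return False
            if state == TARGET:
                return True
    return False


def success_rate(randomize_every, recon_every, trials=500, seed=0):
    """Fraction of simulated episodes in which the attacker succeeds."""
    rng = random.Random(seed)
    wins = sum(simulate(randomize_every, recon_every, rng=rng) for _ in range(trials))
    return wins / trials
```

Calling `success_rate` with different `randomize_every` and `recon_every` values gives a rough feel for how stale reconnaissance interacts with randomization frequency in this toy setting. The numbers it produces say nothing about the results reported in the paper.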