AI, CR, LG | May 24, 2024

Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine

arXiv:2405.15908v1 | 10 citations | h-index: 4 | IJCNN
Originality: Incremental advance
AI Analysis

This work addresses efficiency and interpretability issues in automated penetration testing for cybersecurity, representing an incremental improvement.

The paper tackles challenges in automated penetration testing (AutoPT) by proposing a knowledge-informed framework using reward machines to encode domain knowledge, resulting in higher training efficiency and better performance compared to agents without such knowledge.

Automated penetration testing (AutoPT) based on reinforcement learning (RL) has proven its ability to improve the efficiency of vulnerability identification in information systems. However, RL-based PT encounters several challenges, including poor sampling efficiency, intricate reward specification, and limited interpretability. To address these issues, we propose a knowledge-informed AutoPT framework called DRLRM-PT, which leverages reward machines (RMs) to encode domain knowledge as guidelines for training a PT policy. In our study, we specifically focus on lateral movement as a PT case study and formulate it as a partially observable Markov decision process (POMDP) guided by RMs. We design two RMs based on the MITRE ATT&CK knowledge base for lateral movement. To solve the POMDP and optimize the PT policy, we employ the deep Q-learning algorithm with RM (DQRM). The experimental results demonstrate that the DQRM agent exhibits higher training efficiency in PT compared to agents without knowledge embedding. Moreover, RMs encoding more detailed domain knowledge demonstrate better PT performance compared to RMs with simpler knowledge.
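To make the idea of a reward machine concrete, the following is a minimal sketch of a finite-state machine that turns high-level events into rewards. It is not the paper's DRLRM-PT implementation: the state names (`u0` to `u3`), the event labels (`discovered_credentials`, `connected_to_new_node`, `owned_node`), and the reward values are hypothetical, chosen only to illustrate how staged lateral-movement knowledge could be encoded. In a DQRM-style setup, the agent would presumably learn Q-values over pairs of (observation, RM state), so the RM state tells the policy which subgoal it is currently pursuing.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """A finite-state machine mapping (RM state, event) to (next state, reward)."""
    initial: str
    terminal: set
    transitions: dict = field(default_factory=dict)
    default_reward: float = -0.1  # hypothetical per-step cost when nothing progresses

    def step(self, state, events):
        # Events are the high-level labels detected after an environment step.
        # Sorting keeps the choice deterministic if several events match.
        for event in sorted(events):
            key = (state, event)
            if key in self.transitions:
                return self.transitions[key]
        return state, self.default_reward


def build_lateral_movement_rm():
    # Hypothetical three-stage RM: find credentials, connect, take ownership.
    return RewardMachine(
        initial="u0",
        terminal={"u3"},
        transitions={
            ("u0", "discovered_credentials"): ("u1", 1.0),
            ("u1", "connected_to_new_node"): ("u2", 1.0),
            ("u2", "owned_node"): ("u3", 10.0),
        },
    )


def run_trace(rm, trace):
    """Replay a sequence of event sets and return the final RM state and total reward."""
    state, total = rm.initial, 0.0
    for events in trace:
        state, reward = rm.step(state, events)
        total += reward
        if state in rm.terminal:
            break
    return state, total


if __name__ == "__main__":
    rm = build_lateral_movement_rm()
    trace = [set(), {"discovered_credentials"}, {"connected_to_new_node"}, {"owned_node"}]
    print(run_trace(rm, trace))
```

Events that arrive out of order (for example `owned_node` while the machine is still in `u0`) leave the state unchanged and only incur the step cost, which is how the machine rewards progress along the encoded sequence rather than isolated actions.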

