LG AI CR SOC-PH May 26, 2022

Dynamic Network Reconfiguration for Entropy Maximization using Deep Reinforcement Learning

Christoffel Doorman, Victor-Alexandru Darvariu, Stephen Hailes, Mirco Musolesi

arXiv:2205.13578v25.83 citations h-index: 55 Has Code

Originality: Incremental advance

AI Analysis

This work addresses network reconfiguration for cybersecurity applications, offering a novel method to protect against intrusions by increasing network entropy, though it is incremental as it builds on existing deep reinforcement learning and graph neural network techniques.

The paper tackles the problem of reconfiguring networks to optimize structural properties, focusing on maximizing entropy to improve cybersecurity by scrambling the network topology. It demonstrates that the deep reinforcement learning approach achieves better entropy gains than random rewiring, while remaining computationally efficient and generalizing to graphs larger than those seen during training.

A key problem in network theory is how to reconfigure a graph in order to optimize a quantifiable objective. Given the ubiquity of networked systems, such work has broad practical applications in a variety of situations, ranging from drug and material design to telecommunications. The large decision space of possible reconfigurations, however, makes this problem computationally intensive. In this paper, we cast the problem of network rewiring for optimizing a specified structural property as a Markov Decision Process (MDP), in which a decision-maker is given a budget of modifications that are performed sequentially. We then propose a general approach based on the Deep Q-Network (DQN) algorithm and graph neural networks (GNNs) that can efficiently learn strategies for rewiring networks. We then discuss a cybersecurity case study, i.e., an application to the computer network reconfiguration problem for intrusion protection. In a typical scenario, an attacker might have a (partial) map of the system they plan to penetrate; if the network is effectively "scrambled", they would not be able to navigate it since their prior knowledge would become obsolete. This can be viewed as an entropy maximization problem, in which the goal is to increase the surprise of the network. Indeed, entropy acts as a proxy measurement of the difficulty of navigating the network topology. We demonstrate the general ability of the proposed method to obtain better entropy gains than random rewiring on synthetic and real-world graphs while being computationally inexpensive, as well as being able to generalize to larger graphs than those seen during training. Simulations of attack scenarios confirm the effectiveness of the learned rewiring strategies.
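The sequential rewiring formulation in the abstract can be illustrated with a minimal sketch of the random-rewiring baseline. It is not the authors' implementation, and it does not include the DQN or GNN components. Several things here are assumptions made for illustration: the Shannon entropy of the degree distribution stands in for the entropy objective (the abstract does not specify the measure), a single action removes one edge and adds another at the same endpoint, and the helper names `edge`, `degree_entropy`, `random_rewire_step`, and `run_episode` are hypothetical.

```python
import math
import random
from collections import Counter


def edge(u, v):
    """Canonical (sorted) representation of an undirected edge."""
    return (u, v) if u < v else (v, u)


def degree_entropy(n_nodes, edges):
    """Shannon entropy (bits) of the degree distribution.

    Assumption: used here only as an illustrative entropy objective.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Count how many nodes have each degree (isolated nodes have degree 0).
    counts = Counter(degree[node] for node in range(n_nodes))
    return -sum((c / n_nodes) * math.log2(c / n_nodes) for c in counts.values())


def random_rewire_step(n_nodes, edges, rng):
    """One action: remove edge (u, v) and add a new edge (u, w)."""
    u, v = rng.choice(sorted(edges))
    if rng.random() < 0.5:
        u, v = v, u  # choose which endpoint keeps its connection
    candidates = [w for w in range(n_nodes)
                  if w != u and edge(u, w) not in edges]
    if not candidates:
        return edges  # no valid rewiring from u; leave the graph unchanged
    w = rng.choice(candidates)
    return (edges - {edge(u, v)}) | {edge(u, w)}


def run_episode(n_nodes, edges, budget, seed=0):
    """Spend a budget of sequential rewirings; return the graph and entropy gain."""
    rng = random.Random(seed)
    start = degree_entropy(n_nodes, edges)
    for _ in range(budget):
        edges = random_rewire_step(n_nodes, edges, rng)
    return edges, degree_entropy(n_nodes, edges) - start


# A 6-node ring: every node has degree 2, so the starting entropy is zero.
ring = {edge(i, (i + 1) % 6) for i in range(6)}
new_edges, gain = run_episode(6, ring, budget=3)
print(len(new_edges), gain)
```

Each step preserves the number of edges, so the budget counts modifications rather than additions. A learned policy would replace the two `rng.choice` calls with decisions that depend on the current graph state.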
