DS AI NI Feb 25, 2023

Limited Query Graph Connectivity Test

Mingyu Guo, Jialiang Li, Aneta Neumann, Frank Neumann, Hung Nguyen

arXiv:2302.13036v33.37 citations h-index: 47

Originality: Incremental advance

AI Analysis

This work addresses a cyber security need: efficiently establishing whether an attack path exists in a network. The advance is incremental, since it builds on Stochastic Boolean Function Evaluation and mainly improves scalability.

The paper tackles s-t connectivity testing in a graph with hidden edge states under a limited query budget. It proposes a combinatorial optimization model that minimizes the expected number of queries, and it demonstrates an exact algorithm that scales to graphs with tens of thousands of edges, outperforming existing methods.

We propose a combinatorial optimisation model called Limited Query Graph Connectivity Test. We consider a graph whose edges have two possible states (On/Off). The edges' states are hidden initially. We can query an edge to reveal its state. Given a source s and a destination t, we aim to test s-t connectivity by identifying either a path (consisting of only On edges) or a cut (consisting of only Off edges). We are limited to B queries, after which we stop regardless of whether graph connectivity has been established. We aim to design a query policy that minimizes the expected number of queries. Our model is mainly motivated by a cyber security use case where we need to establish whether an attack path exists in a network between a source and a destination. Each edge query is resolved manually by the IT admin, which motivates minimizing the number of queries. Our model is closely related to monotone Stochastic Boolean Function Evaluation (SBFE). There are two existing exact algorithms for SBFE, both of which are prohibitively expensive. We propose a significantly more scalable exact algorithm. While previous exact algorithms only scale to trivial graphs (i.e., past works experimented on at most 20 edges), we empirically demonstrate that our algorithm scales to a wide range of much larger practical graphs (i.e., Windows domain network graphs with tens of thousands of edges). We propose three heuristics. Our best-performing heuristic reduces the search horizon of the exact algorithm. The other two are based on reinforcement learning (RL) and Monte Carlo tree search (MCTS). We also derive an anytime algorithm for computing the performance lower bound. Experimentally, we show that all our heuristics are near optimal. The heuristic based on the exact algorithm outperforms all others, surpassing RL, MCTS and 8 existing heuristics ported from SBFE and related literature.
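To make the model concrete, here is a minimal brute-force sketch of the objective. It is not the paper's exact algorithm, only a naive memoised recursion that is feasible for a handful of edges. It rests on assumptions the abstract does not state: edges are undirected, each edge is On independently with a known probability (as is usual in SBFE), and the names `min_expected_queries`, `on_prob`, and `budget` are hypothetical.

```python
from functools import lru_cache

UNKNOWN, OFF, ON = -1, 0, 1


def min_expected_queries(edges, on_prob, s, t, budget):
    """Minimum expected number of queries, by exhaustive search (tiny graphs only).

    edges   : list of (u, v) pairs, treated as undirected (assumption)
    on_prob : on_prob[i] is the assumed independent probability that edge i is On
    budget  : the query limit B
    """

    def reachable(states, usable):
        # Depth-first search from s using only edges whose state is in `usable`.
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for (a, b), st in zip(edges, states):
                if st not in usable:
                    continue
                for x, y in ((a, b), (b, a)):
                    if x == u and y not in seen:
                        seen.add(y)
                        stack.append(y)
        return t in seen

    def decided(states):
        if reachable(states, {ON}):
            return True  # a path of On edges has been found
        if not reachable(states, {ON, UNKNOWN}):
            return True  # the revealed Off edges already form a cut
        return False

    @lru_cache(maxsize=None)
    def solve(states):
        used = sum(st != UNKNOWN for st in states)
        if used >= budget or decided(states):
            return 0.0  # stop: budget exhausted or connectivity established
        best = float("inf")
        for i, st in enumerate(states):
            if st != UNKNOWN:
                continue
            on = states[:i] + (ON,) + states[i + 1:]
            off = states[:i] + (OFF,) + states[i + 1:]
            # One query now, plus the expected cost of the two outcomes.
            cost = 1.0 + on_prob[i] * solve(on) + (1.0 - on_prob[i]) * solve(off)
            best = min(best, cost)
        return best

    return solve((UNKNOWN,) * len(edges))
```

For a two-edge chain s-a-t with both edges On with probability 0.5 and B = 2, the function returns 1.5: the first query always happens, and the second is needed only when the first edge turns out to be On. The state space grows exponentially with the number of edges, which is the scalability obstacle the abstract refers to.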
