CL · AI · CR · May 29, 2021

Constructing Flow Graphs from Procedural Cybersecurity Texts

arXiv:2105.14357v1711 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the problem of automating structure recovery from procedural texts for cybersecurity analysts, though it is incremental as it builds on existing NLP and graph-based methods.

The paper tackles the challenge of structuring procedural cybersecurity texts by proposing a method to identify relevant information and generate instruction flows, achieving improved performance with a Graph Convolution Network using BERT embeddings across multiple domains.

Following procedural texts written in natural language is challenging. We must read the whole text to identify the relevant information or the instruction flows needed to complete a task, a process that is prone to failure. If such texts are structured, we can readily visualize instruction flows, reason about or infer a particular step, or even build automated systems to help novice agents achieve a goal. However, this structure recovery task is challenging because of the diverse nature of such texts. This paper proposes to identify relevant information from such texts and generate information flows between sentences. We built a large annotated procedural text dataset (CTFW) in the cybersecurity domain (3154 documents). This dataset contains valuable instructions drawn from software vulnerability analysis experiences. We performed extensive experiments on CTFW with our LM-GNN model variants in multiple settings. To show the generalizability of both this task and our method, we also experimented with procedural texts from two other domains (Maintenance Manual and Cooking), which are substantially different from cybersecurity. Our experiments show that a Graph Convolution Network with BERT sentence embeddings outperforms BERT in all three domains.
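As a rough illustration of the idea in the abstract (sentences as graph nodes, sentence embeddings as node features, a graph convolution over candidate edges), the sketch below implements one generic graph-convolution layer in plain Python. It is not the authors' LM-GNN code. The toy 2-d vectors stand in for BERT sentence embeddings, the consecutive-sentence adjacency is an assumed candidate-edge scheme, the identity weight matrix is a placeholder for learned parameters, and `edge_score` is a hypothetical helper for scoring a candidate flow edge.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, features, weight):
    """One graph-convolution layer: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def edge_score(h, i, j):
    """Hypothetical edge scorer: sigmoid of the dot product of two nodes."""
    dot = sum(a * b for a, b in zip(h[i], h[j]))
    return 1.0 / (1.0 + math.exp(-dot))

# Three sentences; toy 2-d vectors stand in for BERT sentence embeddings.
features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
# Assumed candidate edges between consecutive sentences (undirected here).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
# Identity weights as a placeholder; in practice these would be learned.
weight = [[1.0, 0.0], [0.0, 1.0]]

adj_norm = normalize_adjacency(adj)
hidden = gcn_layer(adj_norm, features, weight)
print(edge_score(hidden, 0, 1))
```

In a real pipeline the features would come from a pretrained sentence encoder and the weights would be trained against annotated flow edges; the sketch only shows how neighbouring sentences' representations get mixed before an edge is scored.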

Code Implementations: 1 repo
