CRApr 17, 2023
Training Automated Defense Strategies Using Graph-based Cyber Attack SimulationsJakob Nyberg, Pontus Johnson
We implemented and evaluated an automated cyber defense agent. The agent takes security alerts as input and uses reinforcement learning to learn a policy for executing predefined defensive measures. The defender policies were trained in an environment intended to simulate a cyber attack. In the simulation, an attacking agent attempts to capture targets in the environment, while the defender attempts to protect them by enabling defenses. The environment was modeled using attack graphs based on the Meta Attack Language language. We assumed that defensive measures have downtime costs, meaning that the defender agent was penalized for using them. We also assumed that the environment was equipped with an imperfect intrusion detection system that occasionally produces erroneous alerts based on the environment state. To evaluate the setup, we trained the defensive agent with different volumes of intrusion detection system noise. We also trained agents with different attacker strategies and graph sizes. In experiments, the defensive agent using policies trained with reinforcement learning outperformed agents using heuristic policies. Experiments also demonstrated that the policies could generalize across different attacker strategies. However, the performance of the learned policies decreased as the attack graphs increased in size.
AIJul 8, 2024
Structural Generalization in Autonomous Cyber Incident Response with Message-Passing Neural Networks and Reinforcement LearningJakob Nyberg, Pontus Johnson
We believe that agents for automated incident response based on machine learning need to handle changes in network structure. Computer networks are dynamic, and can naturally change in structure over time. Retraining agents for small network changes costs time and energy. We attempt to address this issue with an existing method of relational agent learning, where the relations between objects are assumed to remain consistent across problem instances. The state of the computer network is represented as a relational graph and encoded through a message passing neural network. The message passing neural network and an agent policy using the encoding are optimized end-to-end using reinforcement learning. We evaluate the approach on the second instance of the Cyber Autonomy Gym for Experimentation (CAGE~2), a cyber incident simulator that simulates attacks on an enterprise network. We create variants of the original network with different numbers of hosts and agents are tested without additional training on them. Our results show that agents using relational information are able to find solutions despite changes to the network, and can perform optimally in some instances. Agents using the default vector state representation perform better, but need to be specially trained on each network variant, demonstrating a trade-off between specialization and generalization.
LGSep 11, 2025
Vejde: A Framework for Inductive Deep Reinforcement Learning Based on Factor Graph Color RefinementJakob Nyberg, Pontus Johnson
We present and evaluate Vejde; a framework which combines data abstraction, graph neural networks and reinforcement learning to produce inductive policy functions for decision problems with richly structured states, such as object classes and relations. MDP states are represented as data bases of facts about entities, and Vejde converts each state to a bipartite graph, which is mapped to latent states through neural message passing. The factored representation of both states and actions allows Vejde agents to handle problems of varying size and structure. We tested Vejde agents on eight problem domains defined in RDDL, with ten problem instances each, where policies were trained using both supervised and reinforcement learning. To test policy generalization, we separate problem instances in two sets, one for training and the other solely for testing. Test results on unseen instances for the Vejde agents were compared to MLP agents trained on each problem instance, as well as the online planning algorithm Prost. Our results show that Vejde policies in average generalize to the test instances without a significant loss in score. Additionally, the inductive agents received scores on unseen test instances that on average were close to the instance-specific MLP agents.
CRApr 22, 2021
Research Communities in cyber security: A Comprehensive Literature ReviewSotirios Katsikeas, Pontus Johnson, Mathias Ekstedt et al.
In order to provide a coherent overview of cyber security research, the Scopus academic abstract and citation database was mined to create a citation graph of 98,373 authors active in the field between 1949 and early 2020. The Louvain community detection algorithm was applied to the graph in order to identify existing research communities. The analysis discovered twelve top-level communities: access control, authentication, biometrics, cryptography (I & II), cyber-physical systems, information hiding, intrusion detection, malwares, quantum cryptography, sensor networks, and usable security. These top-level communities were in turn composed of a total of 80 sub-communities. The analysis results are presented for each community in descriptive text, sub-community graphs, and tables with, for example, the most-cited papers and authors. A comparison between the detected communities and topical areas defined by other related work, is also presented, demonstrating a greater researcher emphasis on cryptography, quantum cryptography, information hiding and biometrics, at the expense of laws and regulation, risk management and governance, and security software lifecycle.
CRApr 22, 2021
Intrinsic Propensity for Vulnerability in Computers? Arbitrary Code Execution in the Universal Turing MachinePontus Johnson
The universal Turing machine is generally considered to be the simplest, most abstract model of a computer. This paper reports on the discovery of an accidental arbitrary code execution vulnerability in Marvin Minsky's 1967 implementation of the universal Turing machine. By submitting crafted data, the machine may be coerced into executing user-provided code. The article presents the discovered vulnerability in detail and discusses its potential implications. To the best of our knowledge, an arbitrary code execution vulnerability has not previously been reported for such a simple system.
SEFeb 18, 2018
Consensus in Software Engineering: A Cognitive Mapping StudyPontus Johnson, Paul Ralph, Mathias Ekstedt et al.
Background: Philosophers of science including Collins, Feyerabend, Kuhn and Latour have all emphasized the importance of consensus within scientific communities of practice. Consensus is important for maintaining legitimacy with outsiders, orchestrating future research, developing educational curricula and agreeing industry standards. Low consensus contrastingly undermines a field's reputation and hinders peer review. Aim: This paper aims to investigate the degree of consensus within the software engineering academic community concerning members' implicit theories of software engineering. Method: A convenience sample of 60 software engineering researchers produced diagrams describing their personal understanding of causal relationships between core software engineering constructs. The diagrams were then analyzed for patterns and clusters. Results: At least three schools of thought may be forming; however, their interpretation is unclear since they do not correspond to known divisions within the community (e.g. Agile vs. Plan-Driven methods). Furthermore, over one third of participants do not belong to any cluster. Conclusion: Although low consensus is common in social sciences, the rapid pace of innovation observed in software engineering suggests that high consensus is achievable given renewed commitment to empiricism and evidence-based practice.