Out-of-Distribution Detection for Neurosymbolic Autonomous Cyber Agents
This work addresses the need for neurosymbolic autonomous cyber agents to reliably detect situations they cannot handle, an incremental improvement for cybersecurity applications.
The paper tackles the problem of making autonomous cyber agents trustworthy by developing an out-of-distribution monitoring algorithm that uses a probabilistic neural network to detect anomalous situations. Experimental results in a simulated cyber environment demonstrate the overall efficiency of the approach.
Autonomous agents for cyber applications take advantage of modern defense techniques by combining conventional and learning-enabled components. These intelligent agents are trained via reinforcement learning (RL) algorithms and can learn, adapt to, reason about, and deploy security rules to defend networked computer systems while maintaining critical operational workflows. However, the knowledge available during training about the state of the operational network and its environment may be limited. To be trustworthy, the agents should reliably detect situations they cannot handle and hand them over to cyber experts. In this work, we develop an out-of-distribution (OOD) monitoring algorithm that uses a Probabilistic Neural Network (PNN) to detect anomalous or OOD situations encountered by RL-based agents with discrete states and discrete actions. To demonstrate the effectiveness of the proposed approach, we integrate the OOD monitoring algorithm with a neurosymbolic autonomous cyber agent that uses behavior trees with learning-enabled components. We evaluate the proposed approach in a simulated cyber environment under different adversarial strategies. Experimental results over a large number of episodes illustrate the overall efficiency of our proposed approach.
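The abstract does not give the details of the monitoring algorithm, so the following is only a minimal sketch of how a PNN-style monitor for discrete states and actions might look, not the authors' implementation. It assumes a classic Parzen-window formulation: one Gaussian kernel per training state, averaged per action. The class name `PNNMonitor`, the parameters `sigma` and `threshold`, and the toy state vectors are all hypothetical.

```python
import numpy as np


class PNNMonitor:
    """Hypothetical Parzen-window (PNN-style) OOD monitor for discrete state vectors."""

    def __init__(self, states, actions, sigma=0.5, threshold=0.1):
        # Pattern layer: one kernel unit per (state, action) pair seen in training.
        self.states = np.asarray(states, dtype=float)
        self.actions = np.asarray(actions)
        self.sigma = sigma          # assumed kernel width
        self.threshold = threshold  # assumed decision threshold

    def score(self, state, action):
        """Mean Gaussian kernel activation against training states paired with `action`."""
        patterns = self.states[self.actions == action]
        if len(patterns) == 0:
            return 0.0  # action never seen in training
        diff = patterns - np.asarray(state, dtype=float)
        sq_dist = np.sum(diff ** 2, axis=1)
        return float(np.mean(np.exp(-sq_dist / (2.0 * self.sigma ** 2))))

    def is_ood(self, state, action):
        """Flag the (state, action) pair as OOD when its score falls below the threshold."""
        return self.score(state, action) < self.threshold


# Toy training data: three binary state vectors with the action the agent took in each.
train_states = [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
train_actions = [0, 1, 0]
monitor = PNNMonitor(train_states, train_actions)

seen = monitor.is_ood([0, 0, 1], 0)    # state/action pair present in training
unseen = monitor.is_ood([1, 1, 0], 0)  # state far from anything seen with action 0
```

In a deployment along the lines the abstract describes, a flagged pair would be the point at which the agent hands the situation over to a cyber expert. The threshold would need to be tuned on held-out in-distribution episodes.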