LogGuardQ: A Cognitive-Enhanced Reinforcement Learning Framework for Cybersecurity Anomaly Detection in Security Logs
This addresses cybersecurity anomaly detection for security logs, offering a scalable approach with potential applications in intrusion detection, but it appears incremental as it builds on existing RL methods with cognitive enhancements.
The study tackled the problem of inefficient exploration and instability in reinforcement learning for cybersecurity anomaly detection by introducing LogGuardQ, a framework integrating cognitive-inspired dual-memory and adaptive exploration, achieving a 96.0% detection rate compared to 93.0% for DQN and 47.1% for PPO on a dataset with 47.9% anomalies.
Reinforcement learning (RL) has transformed sequential decision-making, but traditional algorithms like Deep Q-Networks (DQNs) and Proximal Policy Optimization (PPO) often struggle with efficient exploration, stability, and adaptability in dynamic environments. This study presents LogGuardQ (Adaptive Log Guard with Cognitive enhancement), a novel framework that integrates a dual-memory system inspired by human cognition and adaptive exploration strategies driven by temperature decay and curiosity. Evaluated on a dataset of 1,000,000 simulated access logs with 47.9% anomalies over 20,000 episodes, LogGuardQ achieves a 96.0% detection rate (versus 93.0% for DQN and 47.1% for PPO), with precision of 0.4776, recall of 0.9996, and an F1-score of 0.6450. The mean reward is 20.34 \pm 44.63 across all episodes (versus 18.80 \pm 43.98 for DQN and -0.17 \pm 23.79 for PPO), with an average of 5.0 steps per episode (constant across models). Graphical analyses, including learning curves smoothed with a Savgol filter (window=501, polynomial=2), variance trends, action distributions, and cumulative detections, demonstrate LogGuardQ's superior stability and efficiency. Statistical tests (Mann-Whitney U) confirm significant performance advantages (e.g., p = 0.0002 vs. DQN with negligible effect size, p < 0.0001 vs. PPO with medium effect size, and p < 0.0001 for DQN vs. PPO with small effect size). By bridging cognitive science and RL, LogGuardQ offers a scalable approach to adaptive learning in uncertain environments, with potential applications in cybersecurity, intrusion detection, and decision-making under uncertainty.