CRJan 22
TempoNet: Learning Realistic Communication and Timing Patterns for Network Traffic SimulationKristen Moore, Diksha Goel, Cody James Christopher et al.
Realistic network traffic simulation is critical for evaluating intrusion detection systems, stress-testing network protocols, and constructing high-fidelity environments for cybersecurity training. While attack traffic can often be layered into training environments using red-teaming or replay methods, generating authentic benign background traffic remains a core challenge -- particularly in simulating the complex temporal and communication dynamics of real-world networks. This paper introduces TempoNet, a novel generative model that combines multi-task learning with multi-mark temporal point processes to jointly model inter-arrival times and all packet- and flow-header fields. TempoNet captures fine-grained timing patterns and higher-order correlations such as host-pair behavior and seasonal trends, addressing key limitations of GAN-, LLM-, and Bayesian-based methods that fail to reproduce structured temporal variation. TempoNet produces temporally consistent, high-fidelity traces, validated on real-world datasets. Furthermore, we show that intrusion detection models trained on TempoNet-generated background traffic perform comparably to those trained on real data, validating its utility for real-world security applications.
CRSep 22, 2023
Enhancing Network Resilience through Machine Learning-powered Graph Combinatorial Optimization: Applications in Cyber Defense and Information DiffusionDiksha Goel
With the burgeoning advancements of computing and network communication technologies, network infrastructures and their application environments have become increasingly complex. Due to the increased complexity, networks are more prone to hardware faults and highly susceptible to cyber-attacks. Therefore, for rapidly growing network-centric applications, network resilience is essential to minimize the impact of attacks and to ensure that the network provides an acceptable level of services during attacks, faults or disruptions. In this regard, this thesis focuses on developing effective approaches for enhancing network resilience. Existing approaches for enhancing network resilience emphasize on determining bottleneck nodes and edges in the network and designing proactive responses to safeguard the network against attacks. However, existing solutions generally consider broader application domains and possess limited applicability when applied to specific application areas such as cyber defense and information diffusion, which are highly popular application domains among cyber attackers. This thesis aims to design effective, efficient and scalable techniques for discovering bottleneck nodes and edges in the network to enhance network resilience in cyber defense and information diffusion application domains. We first investigate a cyber defense graph optimization problem, i.e., hardening active directory systems by discovering bottleneck edges in the network. We then study the problem of identifying bottleneck structural hole spanner nodes, which are crucial for information diffusion in the network. We transform both problems into graph-combinatorial optimization problems and design machine learning based approaches for discovering bottleneck points vital for enhancing network resilience.
60.7CRMay 15
Unveiling the Black Box: A Multi-Layer Framework for Explaining Reinforcement Learning-Based Cyber AgentsDiksha Goel, Kristen Moore, Jeff Wang et al.
Reinforcement Learning (RL) agents are increasingly used to simulate sophisticated cyberattacks, but their decision-making processes remain opaque, hindering trust, debugging, and defensive preparedness. In high-stakes cybersecurity contexts, explainability is essential for understanding how adversarial strategies are formed and evolve over time. In this paper, we propose a unified, multi-layer explainability framework for RL-based attacker agents that reveals both strategic (Markov Decision Process (MDP)-level) and tactical (policy-level) reasoning. At the MDP-level, we model cyberattacks as a Partially Observable Markov Decision Process (POMDP) to expose exploration-exploitation dynamics and phase-aware behavioural shifts. At the policy-level, we analyse the temporal evolution of Q-values and use Prioritised Experience Replay (PER) to surface critical learning transitions and evolving action preferences. Evaluated across CyberBattleSim environments of increasing complexity, our framework offers interpretable insights into agent behaviour at scale. Unlike previous explainable RL methods, which are {predominantly} post-hoc, domain-specific, or limited in depth, our approach is both agent- and environment-agnostic, {supporting use cases such as red-team simulation, RL policy debugging, phase-aware threat modelling and anticipatory defence planning.} By transforming black-box learning into actionable behavioural intelligence, our framework enables both defenders and developers to better anticipate, analyse, and respond to autonomous cyber threats.
LGJan 29, 2025Code
CAMP in the Odyssey: Provably Robust Reinforcement Learning with Certified Radius MaximizationDerui Wang, Kristen Moore, Diksha Goel et al.
Deep reinforcement learning (DRL) has gained widespread adoption in control and decision-making tasks due to its strong performance in dynamic environments. However, DRL agents are vulnerable to noisy observations and adversarial attacks, and concerns about the adversarial robustness of DRL systems have emerged. Recent efforts have focused on addressing these robustness issues by establishing rigorous theoretical guarantees for the returns achieved by DRL agents in adversarial settings. Among these approaches, policy smoothing has proven to be an effective and scalable method for certifying the robustness of DRL agents. Nevertheless, existing certifiably robust DRL relies on policies trained with simple Gaussian augmentations, resulting in a suboptimal trade-off between certified robustness and certified return. To address this issue, we introduce a novel paradigm dubbed \texttt{C}ertified-r\texttt{A}dius-\texttt{M}aximizing \texttt{P}olicy (\texttt{CAMP}) training. \texttt{CAMP} is designed to enhance DRL policies, achieving better utility without compromising provable robustness. By leveraging the insight that the global certified radius can be derived from local certified radii based on training-time statistics, \texttt{CAMP} formulates a surrogate loss related to the local certified radius and optimizes the policy guided by this surrogate loss. We also introduce \textit{policy imitation} as a novel technique to stabilize \texttt{CAMP} training. Experimental results demonstrate that \texttt{CAMP} significantly improves the robustness-return trade-off across various tasks. Based on the results, \texttt{CAMP} can achieve up to twice the certified expected return compared to that of baselines. Our code is available at https://github.com/NeuralSec/camp-robust-rl.
61.2CRMay 3
AgenticVM: Agentic AI for Adaptive Software Vulnerability ManagementAsrul Arifin, Hussain Ahmad, Yiyao Zhang et al.
As software systems grow in scale and complexity, vulnerability management is increasingly strained by high alert volumes, fragmented toolchains, and manual triage processes. We introduce AgenticVM, a multi-agent framework that integrates large language models with security tools to automate vulnerability detection, assessment, prioritization, and reporting. AgenticVM combines rule-based processing, a BERT-based CVSS prediction module, and specialised LLM-driven agents, leveraging data from sources such as the National Vulnerability Database and the European Union Vulnerability Database. Across multiple evaluation scenarios, AgenticVM reduces raw scanner outputs into compact, actionable queues, achieving up to 98% alert reduction (e.g., from 3,983 findings to 82 high-priority items), while predicting missing CVSS attributes with 89.3% accuracy. These results demonstrate improved prioritisation efficiency and reduced analyst workload without compromising risk visibility. Beyond performance, the framework provides practical design insights into agent decomposition, tool-LLM integration, and human-in-the-loop governance for real-world deployment.
CRDec 6, 2024
ChatNVD: Advancing Cybersecurity Vulnerability Assessment with Large Language ModelsShivansh Chopra, Hussain Ahmad, Diksha Goel et al.
The increasing frequency and sophistication of cybersecurity vulnerabilities in software systems underscores the need for more robust and effective vulnerability assessment methods. However, existing approaches often rely on highly technical and abstract frameworks, which hinder understanding and increase the likelihood of exploitation, resulting in severe cyberattacks. In this paper, we introduce ChatNVD, a support tool powered by Large Language Models (LLMs) that leverages the National Vulnerability Database (NVD) to generate accessible, context-rich summaries of software vulnerabilities. We develop three variants of ChatNVD, utilizing three prominent LLMs: GPT-4o Mini by OpenAI, LLaMA 3 by Meta, and Gemini 1.5 Pro by Google. To evaluate their performance, we conduct a comparative evaluation focused on their ability to identify, interpret, and explain software vulnerabilities. Our results demonstrate that GPT-4o Mini outperforms the other models, achieving over 92% accuracy and the lowest error rates, making it the most reliable option for real-world vulnerability assessment.
CRDec 9, 2024
Machine Learning Driven Smishing Detection Framework for Mobile SecurityDiksha Goel, Hussain Ahmad, Ankit Kumar Jain et al.
The increasing reliance on smartphones for communication, financial transactions, and personal data management has made them prime targets for cyberattacks, particularly smishing, a sophisticated variant of phishing conducted via SMS. Despite the growing threat, traditional detection methods often struggle with the informal and evolving nature of SMS language, which includes abbreviations, slang, and short forms. This paper presents an enhanced content-based smishing detection framework that leverages advanced text normalization techniques to improve detection accuracy. By converting nonstandard text into its standardized form, the proposed model enhances the efficacy of machine learning classifiers, particularly the Naive Bayesian classifier, in distinguishing smishing messages from legitimate ones. Our experimental results, validated on a publicly available dataset, demonstrate a detection accuracy of 96.2%, with a low False Positive Rate of 3.87% and False Negative Rate of 2.85%. This approach significantly outperforms existing methodologies, providing a robust solution to the increasingly sophisticated threat of smishing in the mobile environment.
CLJan 8, 2025
The Future of AI: Exploring the Potential of Large Concept ModelsHussain Ahmad, Diksha Goel
The field of Artificial Intelligence (AI) continues to drive transformative innovations, with significant progress in conversational interfaces, autonomous vehicles, and intelligent content creation. Since the launch of ChatGPT in late 2022, the rise of Generative AI has marked a pivotal era, with the term Large Language Models (LLMs) becoming a ubiquitous part of daily life. LLMs have demonstrated exceptional capabilities in tasks such as text summarization, code generation, and creative writing. However, these models are inherently limited by their token-level processing, which restricts their ability to perform abstract reasoning, conceptual understanding, and efficient generation of long-form content. To address these limitations, Meta has introduced Large Concept Models (LCMs), representing a significant shift from traditional token-based frameworks. LCMs use concepts as foundational units of understanding, enabling more sophisticated semantic reasoning and context-aware decision-making. Given the limited academic research on this emerging technology, our study aims to bridge the knowledge gap by collecting, analyzing, and synthesizing existing grey literature to provide a comprehensive understanding of LCMs. Specifically, we (i) identify and describe the features that distinguish LCMs from LLMs, (ii) explore potential applications of LCMs across multiple domains, and (iii) propose future research directions and practical strategies to advance LCM development and adoption.
68.9CRApr 6
Explainable Autonomous Cyber Defense using Adversarial Multi-Agent Reinforcement LearningYiyao Zhang, Diksha Goel, Hussain Ahmad
Autonomous agents are increasingly deployed in both offensive and defensive cyber operations, creating high-speed, closed-loop interactions in critical infrastructure environments. Advanced Persistent Threat (APT) actors exploit "Living off the Land" techniques and targeted telemetry perturbations to induce ambiguity in monitoring systems, causing automated defenses to overreact or misclassify benign behavior as malicious activity. Existing monolithic and multi-agent defense pipelines largely operate on correlation-based signals, lack structural constraints on response actions, and are vulnerable to reasoning drift under ambiguous or adversarial inputs. We present the Causal Multi-Agent Decision Framework (C-MADF), a structurally constrained architecture for autonomous cyber defense that integrates causal modeling with adversarial dual-policy control. C-MADF first learns a Structural Causal Model (SCM) from historical telemetry and compiles it into an investigation-level Directed Acyclic Graph (DAG) that defines admissible response transitions. This roadmap is formalized as a Markov Decision Process (MDP) whose action space is explicitly restricted to causally consistent transitions. Decision-making within this constrained space is performed by a dual-agent reinforcement learning system in which a threat-optimizing Blue-Team policy is counterbalanced by a conservatively shaped Red-Team policy. Inter-policy disagreement is quantified through a Policy Divergence Score and exposed via a human-in-the-loop interface equipped with an Explainability-Transparency Score that serves as an escalation signal under uncertainty. On the real-world CICIoT2023 dataset, C-MADF reduces the false-positive rate from 11.2%, 9.7%, and 8.4% in three cutting-edge literature baselines to 1.8%, while achieving 0.997 precision, 0.961 recall, and 0.979 F1-score.
PMSep 14, 2025
RegimeFolio: A Regime Aware ML System for Sectoral Portfolio Optimization in Dynamic MarketsYiyao Zhang, Diksha Goel, Hussain Ahmad et al.
Financial markets are inherently non-stationary, with shifting volatility regimes that alter asset co-movements and return distributions. Standard portfolio optimization methods, typically built on stationarity or regime-agnostic assumptions, struggle to adapt to such changes. To address these challenges, we propose RegimeFolio, a novel regime-aware and sector-specialized framework that, unlike existing regime-agnostic models such as DeepVol and DRL optimizers, integrates explicit volatility regime segmentation with sector-specific ensemble forecasting and adaptive mean-variance allocation. This modular architecture ensures forecasts and portfolio decisions remain aligned with current market conditions, enhancing robustness and interpretability in dynamic markets. RegimeFolio combines three components: (i) an interpretable VIX-based classifier for market regime detection; (ii) regime and sector-specific ensemble learners (Random Forest, Gradient Boosting) to capture conditional return structures; and (iii) a dynamic mean-variance optimizer with shrinkage-regularized covariance estimates for regime-aware allocation. We evaluate RegimeFolio on 34 large cap U.S. equities from 2020 to 2024. The framework achieves a cumulative return of 137 percent, a Sharpe ratio of 1.17, a 12 percent lower maximum drawdown, and a 15 to 20 percent improvement in forecast accuracy compared to conventional and advanced machine learning benchmarks. These results show that explicitly modeling volatility regimes in predictive learning and portfolio allocation enhances robustness and leads to more dependable decision-making in real markets.
CRJun 28, 2024
Optimizing Cyber Defense in Dynamic Active Directories through Reinforcement LearningDiksha Goel, Kristen Moore, Mingyu Guo et al.
This paper addresses a significant gap in Autonomous Cyber Operations (ACO) literature: the absence of effective edge-blocking ACO strategies in dynamic, real-world networks. It specifically targets the cybersecurity vulnerabilities of organizational Active Directory (AD) systems. Unlike the existing literature on edge-blocking defenses which considers AD systems as static entities, our study counters this by recognizing their dynamic nature and developing advanced edge-blocking defenses through a Stackelberg game model between attacker and defender. We devise a Reinforcement Learning (RL)-based attack strategy and an RL-assisted Evolutionary Diversity Optimization-based defense strategy, where the attacker and defender improve each other strategy via parallel gameplay. To address the computational challenges of training attacker-defender strategies on numerous dynamic AD graphs, we propose an RL Training Facilitator that prunes environments and neural networks to eliminate irrelevant elements, enabling efficient and scalable training for large graphs. We extensively train the attacker strategy, as a sophisticated attacker model is essential for a robust defense. Our empirical results successfully demonstrate that our proposed approach enhances defender's proficiency in hardening dynamic AD graphs while ensuring scalability for large-scale AD.