45.1 CR Jun 4
GenTI: Benchmarking LLMs for Autonomous IDPS Rule Generation for Unseen Attacks
Hassan Jalil Hadi, Rehana Yasmin, Ali Shoker
Rule-based Intrusion Detection and Prevention Systems (IDPS) offer precise attack detection and mitigation, but their manually crafted, signature-driven rules limit adaptability to emerging and zero-day threats. Existing public datasets (e.g., CICIDS2017, UNSW-NB15) focus on traffic classification and provide little structured information to support automatic rule synthesis or prevention logic. To address this gap, we propose Generative Threat Intelligence (GenTI), an LLM-driven benchmark for automatic generation of IDPS rules targeting unseen attacks (GenTI refers to the proposed framework, and GTI refers to the dataset). The dataset (GTI) aggregates over 150k detection and prevention rules from Snort, Suricata, and Emerging Threats, as well as 50k YARA rules, each annotated with protocol behavior, payload signatures, contextual relationships, mappings to Cyber Threat Intelligence (CTI), and actionable response types (alert, drop, reject). On top of this corpus we design an LLM-based pipeline that transforms analyst prompts and representative payloads into deployable rules via structured prompt engineering, Chain-of-Thought (CoT) reasoning, and a Chain-of-Verification (CoVe) loop for syntactic, semantic, and security validation. The generated rules are executed in real time on Snort/Suricata and evaluated by syntax accuracy, semantic similarity, CTI coverage, security effectiveness, and unseen-attack detection. Our GenTI instantiation achieves a composite rule-quality score of 89.4%, with 94.8% CTI coverage, improving unseen-attack detection from 45% to 87.4% and reducing the false-positive rate from 8.5% to 2.3%. Overall, GenTI establishes the first large-scale benchmark that tightly couples rule-level CTI with LLM-based automation, enabling adaptive, self-evolving IDPS.
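The generate-then-verify idea behind the CoVe loop can be illustrated with a minimal sketch covering only the syntactic stage. This is not the authors' implementation: the regular expression is a deliberately simplified stand-in for a real Snort/Suricata parser, and the names `syntax_issues`, `verify_loop`, and the `generate` callback (standing in for an LLM call) are hypothetical.

```python
import re
from typing import Callable, List, Optional

# Simplified, hypothetical grammar for a Snort/Suricata-style rule.
# Real engines accept far more syntax; this is for illustration only.
RULE_RE = re.compile(
    r"^(alert|drop|reject)\s+"               # action
    r"(tcp|udp|icmp|ip)\s+"                  # protocol
    r"\S+\s+\S+\s+(->|<>)\s+\S+\s+\S+\s+"    # src addr/port, direction, dst addr/port
    r"\((.+)\)$"                             # option block
)


def syntax_issues(rule: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the rule passed."""
    match = RULE_RE.match(rule.strip())
    if not match:
        return ["header or option block does not match the simplified grammar"]
    options = match.group(4)
    issues = []
    if not re.search(r"\bmsg\s*:", options):
        issues.append("missing msg option")
    if not re.search(r"\bsid\s*:\s*\d+\s*;", options):
        issues.append("missing numeric sid option")
    if not options.rstrip().endswith(";"):
        issues.append("options must end with ';'")
    return issues


def verify_loop(
    generate: Callable[[str, List[str]], str],
    prompt: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask the generator for a rule, feeding back issues until it passes or rounds run out."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        rule = generate(prompt, feedback)
        feedback = syntax_issues(rule)
        if not feedback:
            return rule
    return None
```

In a fuller pipeline, the semantic and security checks named in the abstract would presumably run after this stage and feed their findings back to the generator in the same way.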
14.6 AI Mar 16
CRASH: Cognitive Reasoning Agent for Safety Hazards in Autonomous Driving
Erick Silva, Rehana Yasmin, Ali Shoker
As autonomous vehicles (AVs) grow in complexity and diversity, identifying the root causes of operational failures has become increasingly difficult. The heterogeneity of system architectures across manufacturers, ranging from end-to-end to modular designs, together with variations in algorithms and integration strategies, limits the standardization of incident investigations and hinders systematic safety analysis. This work examines real-world AV incidents reported in the NHTSA database. We curate a dataset of 2,168 cases reported between 2021 and 2025, representing more than 80 million miles driven. To process this data, we introduce CRASH, Cognitive Reasoning Agent for Safety Hazards, an LLM-based agent that automates reasoning over crash reports by leveraging both standardized fields and unstructured narrative descriptions. CRASH operates on a unified representation of each incident to generate concise summaries, attribute a primary cause, and assess whether the AV materially contributed to the event. Our findings show that (1) CRASH attributes 64% of incidents to perception or planning failures, underscoring the importance of reasoning-based analysis for accurate fault attribution; and (2) approximately 50% of reported incidents involve rear-end collisions, highlighting a persistent and unresolved challenge in autonomous driving deployment. We further validate CRASH with five domain experts, achieving 86% accuracy in attributing AV system failures. Overall, CRASH demonstrates strong potential as a scalable and interpretable tool for automated crash analysis, providing actionable insights to support safety research and the continued development of autonomous driving systems.
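One way to picture a unified incident representation that combines standardized fields with the free-text narrative is the sketch below. The class name `IncidentRecord`, its field names, and the prompt wording are all hypothetical; the abstract does not specify the actual schema or prompts used by CRASH.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IncidentRecord:
    """Hypothetical container merging structured report fields with the narrative."""

    report_id: str
    structured: Dict[str, str] = field(default_factory=dict)
    narrative: str = ""

    def to_prompt(self) -> str:
        """Render one incident as a single text block an LLM agent could reason over."""
        lines = [f"Report: {self.report_id}"]
        # Sort keys so the same incident always renders identically.
        lines += [f"{key}: {value}" for key, value in sorted(self.structured.items())]
        lines.append(f"Narrative: {self.narrative.strip() or '(none)'}")
        lines.append(
            "Tasks: summarize; name one primary cause; "
            "state whether the AV materially contributed."
        )
        return "\n".join(lines)
```

The three tasks in the final prompt line mirror the three outputs the abstract lists: a summary, a primary cause, and an assessment of AV contribution.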
27.9 CR May 7
Toward Space-Based Public Key Systems: Enabling Secure Space Communications through In-Orbit Trust Services
Rehana Yasmin, Paulo Esteves-Verissimo, Ali Shoker
The New Space era has led to a rapid increase in satellites operated by independent entities in near-Earth orbit. This shift enables richer space services but also requires secure, near-real-time coordination, making efficient authentication of space assets critical for next-generation missions. Traditional ground-dependent Public Key Infrastructure (PKI) suffers from latency and operational bottlenecks that limit scalability and availability in dynamic space environments. This paper proposes architectural designs for space-based PKI that shift certificate management and validation from ground infrastructure into space, reducing reliance on ground stations while enabling interoperability and cross-entity collaboration. Two deployment schemes are introduced: a space-ground integrated PKI with in-orbit validation authorities, and a fully autonomous space-based PKI with in-space issuance and validation. We analyze deployment trade-offs in scalability, availability, security, cost, and operational complexity in multi-operator environments. A baseline latency analysis is provided to illustrate performance implications of in-orbit trust management.
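As a rough illustration of what a baseline latency figure involves, the sketch below computes free-space propagation delay only. It does not reproduce the paper's analysis: the path lengths (a 550 km link to a ground station directly below, a 1,000 km inter-satellite link) are assumed values, and the calculation ignores processing time, queuing, and any wait for a ground station to come into view.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s


def one_way_delay_ms(distance_km: float) -> float:
    """Free-space propagation delay for a single hop, in milliseconds."""
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return distance_km / C_KM_PER_S * 1000.0


def round_trip_ms(distance_km: float) -> float:
    """Request plus response over the same path, propagation only."""
    return 2.0 * one_way_delay_ms(distance_km)


# Assumed, illustrative path lengths (not taken from the paper).
LEO_TO_GROUND_KM = 550.0       # satellite directly above a ground station
INTER_SATELLITE_KM = 1_000.0   # neighbor hosting an in-orbit validation authority

if __name__ == "__main__":
    print(f"ground validation round trip:   {round_trip_ms(LEO_TO_GROUND_KM):.2f} ms")
    print(f"in-orbit validation round trip: {round_trip_ms(INTER_SATELLITE_KM):.2f} ms")
```

Both figures come out at a few milliseconds, so under these assumptions propagation alone does not separate the two schemes; the terms left out here would have to be modeled to compare them meaningfully.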
AI Feb 8, 2024
Savvy: Trustworthy Autonomous Vehicles Architecture
Ali Shoker, Rehana Yasmin, Paulo Esteves-Verissimo
The increasing interest in Autonomous Vehicles (AV) is notable due to business, safety, and performance reasons. While there is salient success in recent AV architectures, hinging on the advancements in AI models, there is a growing number of fatal incidents that impedes full AVs from going mainstream. This calls for revisiting the fundamentals of building safety-critical AV architectures. However, this direction should not deter leveraging the power of AI. To this end, we propose Savvy, a new trustworthy intelligent AV architecture that achieves the best of both worlds. Savvy makes a clear separation between the control plane and the data plane to guarantee the safety-first principles. The former assumes control to ensure safety using rules defined at design time, while launching the latter to optimize decisions as much as possible within safety time bounds. This is achieved through guided Time-aware predictive quality degradation (TPQD): using dynamic ML models that can be tuned to provide either richer or faster outputs based on the available safety time bounds. For instance, Savvy makes it possible to safely identify an elephant as an obstacle (a mere object) as early as possible, rather than optimally recognizing it as an elephant when it is too late. This position paper presents Savvy's motivations and concept, whereas empirical evaluation is a work in progress.
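The trade-off between richer and faster outputs under a safety time bound can be sketched as a simple selection rule. This is an illustration of the idea only, not Savvy's actual mechanism: the `ModelVariant` type, the latency and detail numbers, and the `pick_variant` policy are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical description of one tuning of a perception model."""

    name: str
    latency_ms: float  # assumed worst-case inference time
    detail: int        # higher means richer output


def pick_variant(
    variants: Sequence[ModelVariant], time_budget_ms: float
) -> Optional[ModelVariant]:
    """Return the richest variant that fits the safety time budget, or None.

    None signals that no ML output is available in time, so the caller
    should rely on the design-time safety rules alone.
    """
    feasible = [v for v in variants if v.latency_ms <= time_budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda v: v.detail)


# Illustrative values only.
VARIANTS = [
    ModelVariant("obstacle-only", latency_ms=5.0, detail=1),
    ModelVariant("object-class", latency_ms=40.0, detail=2),
    ModelVariant("fine-grained", latency_ms=120.0, detail=3),
]
```

With these numbers, a 10 ms budget yields the obstacle-only variant (the elephant is just "an obstacle"), while a 50 ms budget allows the object-class variant.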