Ahmed Ryan

28.6SEApr 1

What Are Adversaries Doing? Automating Tactics, Techniques, and Procedures Extraction: A Systematic Review

Mahzabin Tamanna, Shaswata Mitra, Md Erfan et al.

Adversaries continuously evolve their tactics, techniques, and procedures (TTPs) to achieve their objectives while evading detection, requiring defenders to continually update their understanding of adversary behavior. Prior research has proposed automated extraction of TTP-related intelligence from unstructured text and mapping it to structured knowledge bases, such as MITRE ATT&CK. However, existing work varies widely in extraction objectives, datasets, modeling approaches, and evaluation practices, making it difficult to understand the research landscape. The goal of this study is to aid security researchers in understanding the state of the art in extracting attack tactics, techniques, and procedures (TTPs) from unstructured text by analyzing relevant literature. We systematically analyze 80 peer-reviewed studies across key dimensions: extraction purposes, data sources, dataset construction, modeling approaches, evaluation metrics, and artifact availability. Our analysis reveals several dominant trends. Technique-level classification remains the dominant task formulation, while tactic classification and technique searching are underexplored. The field has progressed from rule-based and traditional machine learning to transformer-based architectures (e.g., BERT, SecureBERT, RoBERTa), with recent studies exploring LLM-based approaches including prompting, retrieval-augmented generation, and fine-tuning, though adoption remains emergent. Despite these advances, important limitations persist: many studies rely on single-label classification, limited evaluation settings, and narrow datasets, constraining cross-domain generalization. Reproducibility is further hindered by proprietary datasets, limited code releases, and restricted corpora.

43.8SEApr 24

From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification

Md Erfan, Md Kamal Hossain Chowdhury, Ahmed Ryan et al.

Large Language Models (LLMs) show promise in automated software engineering, yet their guarantee of correctness is frequently undermined by erroneous or hallucinated code. To enforce model honesty, formal verification requires LLMs to synthesize implementation logic alongside formal specifications that are subsequently proven correct by a mathematical verifier. However, the transition from informal natural language to precise formal specification remains an arduous task. Our work addresses this by providing the NaturalLanguage2VerifiedCode (NL2VC)-60 dataset: a collection of 60 complex algorithmic problems. We evaluate 11 randomly selected problem sets across seven open-weight LLMs using a tiered prompting strategy: contextless prompts, signature prompts providing structural anchors, and self-healing prompts utilizing iterative feedback from the Dafny verifier. To address vacuous verification, where models satisfy verifiers with trivial specifications, we integrate the uDebug platform to ensure functional validation. Our results show that while contextless prompting leads to near-universal failure, structural signatures and iterative self-healing facilitate a dramatic performance turnaround. Specifically, Gemma 4-31B achieved a 90.91\% verification success rate, while GPT-OSS 120B rose from zero to 81.82\% success with signature-guided feedback. These findings indicate that formal verification is now attainable for open-weight LLMs, which serve as effective apprentices for synthesizing complex annotations and facilitating high-assurance software development.

Ahmed Ryan

2 Papers