CL · Mar 5, 2024 · Code
Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs
Aly M. Kassem, Omar Mahmoud, Niloofar Mireshghallah et al. · nvidia
In this paper, we introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim agent, compared to what is revealed by prompting the target model with the training data directly, which is the dominant approach to quantifying memorization in LLMs. We use an iterative rejection-sampling optimization process to find instruction-based prompts with two main characteristics: (1) minimal overlap with the training data to avoid presenting the solution directly to the model, and (2) maximal overlap between the victim model's output and the training data, aiming to induce the victim to spit out training data. We observe that our instruction-based prompts generate outputs with 23.7% higher overlap with training data compared to the baseline prefix-suffix measurements. Our findings show that (1) instruction-tuned models can expose pre-training data as much as their base models, if not more so, (2) contexts other than the original training data can lead to leakage, and (3) using instructions proposed by other LLMs can open a new avenue of automated attacks that we should further study and explore. The code can be found at https://github.com/Alymostafa/Instruction_based_attack .
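The iterative rejection-sampling loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `token_overlap` is a toy unique-token score standing in for whatever overlap metric the paper uses, and `propose`, `victim`, `rounds`, and `max_prompt_overlap` are hypothetical names introduced here.

```python
from typing import Callable, List, Tuple


def token_overlap(text: str, reference: str) -> float:
    """Toy score: fraction of unique reference tokens that also occur in text."""
    text_tokens = set(text.lower().split())
    ref_tokens = set(reference.lower().split())
    return len(text_tokens & ref_tokens) / len(ref_tokens) if ref_tokens else 0.0


def rejection_sample_prompt(
    propose: Callable[[str], List[str]],  # hypothetical attacker: prompt -> candidate prompts
    victim: Callable[[str], str],         # hypothetical victim: prompt -> completion
    training_text: str,
    seed_prompt: str,
    rounds: int = 3,
    max_prompt_overlap: float = 0.5,      # assumed threshold, not taken from the paper
) -> Tuple[str, float]:
    """Return the best prompt found and the overlap of its output with the training text."""
    best_prompt = seed_prompt
    best_score = token_overlap(victim(seed_prompt), training_text)
    for _ in range(rounds):
        for candidate in propose(best_prompt):
            # Characteristic (1): reject prompts that hand the training text to the victim.
            if token_overlap(candidate, training_text) > max_prompt_overlap:
                continue
            # Characteristic (2): accept only if the victim's output overlaps more.
            score = token_overlap(victim(candidate), training_text)
            if score > best_score:
                best_prompt, best_score = candidate, score
    return best_prompt, best_score
```

In practice `propose` and `victim` would wrap calls to the attacker and victim models; here they are plain callables so the loop can be exercised with stubs.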
5.7 · LG · May 15
Context-aware Entity-Relation Extraction for Threat Intelligence Knowledge Graphs
Inoussa Mouiche, Sherif Saad
Cybersecurity Knowledge Graphs (CKGs) unify diverse Cyber Threat Intelligence (CTI) sources into structured, queryable formats, offering scalable solutions for automating proactive and real-time security responses. Their increasing adoption has significantly enhanced the workflow and decision-making efficiency of security professionals. However, constructing CKGs requires extracting entity-relation triples from unstructured CTI reports, a task hindered by complex report structure, domain-specific language, and semantic ambiguity. As a result, existing pipeline-based approaches often suffer from error propagation, reducing extraction accuracy and limiting generalizability. This paper introduces the Context-aware Threat Intelligence Knowledge Graph (CTiKG) framework, a pipeline architecture designed to accurately extract and classify threat entities and their relationships from CTI reports. CTiKG incorporates hybrid NLP models that leverage SecureBERT+ contextual embeddings and expert knowledge from a domain ontology to reduce misclassifications and mitigate cascading errors. Experiments on the DNRTI-AUG-STIX2 dataset, which comprises 21 entity types aligned with STIX 2.1, demonstrate significant improvements over state-of-the-art baselines, yielding 3-4% gains in NER and up to 8% in RE performance, based on precision, recall, and F1-score. Additional validation on DNRTI and STUCCO benchmarks confirms the framework's robustness and practical applicability. All datasets, including the curated DNRTI-AUG-STIX2, are released on GitHub to foster reproducibility and further research.
63.9 · CR · May 7
Autonomous Adversary: Red-Teaming in the age of LLM
Mohammad Mamun, Mohamed Gaber, Scott Buffett et al.
Language Model Agents (LMAs) are emerging as a powerful primitive for augmenting red-team operations. They can support attack planning, adversary emulation, and the orchestration of multi-step activity such as lateral movement, a core enabling capability of advanced persistent threat (APT) campaigns. Using frameworks such as MITRE ATT&CK, we analyze where these agents intersect with core offensive functions and assess current strengths and limitations of LMAs with an emphasis on governance and realistic evaluation. We benchmark LMAs across two lateral-movement scenarios in a controlled adversary-emulation environment, where LMAs interact with instrumented cyber agents, observe execution artifacts, and iteratively adapt based on environmental feedback. Each scenario is formalized as an ordered task chain with explicit validation predicates, leveraging an LLM-as-a-Judge paradigm to ensure deterministic outcome verification. We compare three operational modalities: fully autonomous execution, self-scaffolded planning, and expert-defined action plans. Preliminary findings indicate that expert-defined action plans yield higher task-completion rates relative to other operational modes. However, failure remains frequent across all modalities, largely attributable to brittle command invocation, environmental and deployment instability, and recurring errors in credential management and state handling.
19.3 · LG · May 3
TIJERE: A Novel Threat Intelligence Joint Extraction Model Based on Analyst Expert Knowledge
Inoussa Mouiche, Sherif Saad
The extraction of entities and relationships from threat intelligence reports into structured formats, such as cybersecurity knowledge graphs, is essential for automated threat analysis, detection, and mitigation. However, existing joint extraction methods struggle with feature confusion, language ambiguity, noise propagation, and overlapping relations, resulting in low accuracy and poor model performance. This paper presents TIJERE, an innovative joint entity and relation extraction framework that formulates joint extraction as a multisequence labeling representation (MSLR) problem. Specifically, separate sequences are generated for each entity pair. Unlike prior tagging schemes, MSLR integrates expert domain features to enrich positional, contextual, and semantic representations of entities, thereby enhancing feature distinction and classification accuracy. Additionally, TIJERE reduces language ambiguity and enhances domain-specific generalization by leveraging SecureBERT+, a contextual language model fine-tuned on cybersecurity text. This improves both named entity recognition (NER) and relation extraction (RE). This paper also introduces DNRTI-JE, the first publicly available jointly labeled dataset for cybersecurity entity and RE, filling a crucial gap in cyber threat intelligence automation. Empirical evaluations on the curated DNRTI-JE dataset demonstrate that TIJERE achieves state-of-the-art performance, with F1-scores exceeding 0.93 for NER and 0.98 for RE, outperforming existing methods. Together, TIJERE and the standardized benchmarking DNRTI-JE dataset enable high-performance cybersecurity intelligence extraction, with transferable applications in healthcare, finance, and bioinformatics.
CL · Jan 21, 2024
Finding a Needle in the Adversarial Haystack: A Targeted Paraphrasing Approach For Uncovering Edge Cases with Minimal Distribution Distortion
Aly M. Kassem, Sherif Saad
Adversarial attacks against language models (LMs) are a significant concern. In particular, adversarial samples exploit the model's sensitivity to small input changes. While these changes appear insignificant to the semantics of the input sample, they result in significant decay in model performance. In this paper, we propose Targeted Paraphrasing via RL (TPRL), an approach to automatically learn a policy to generate challenging samples that most likely improve the model's performance. TPRL leverages FLAN T5, a language model, as a generator and employs a self-learned policy using a proximal policy gradient to generate the adversarial examples automatically. TPRL's reward is based on the confusion induced in the classifier, preserving the original text meaning through a Mutual Implication score. We demonstrate and evaluate TPRL's effectiveness in discovering natural adversarial attacks and improving model performance through extensive experiments on four diverse NLP classification tasks via Automatic and Human evaluation. TPRL outperforms strong baselines, exhibits generalizability across classifiers and datasets, and combines the strengths of language modeling and reinforcement learning to generate diverse and influential adversarial examples.
DC · Aug 19, 2021
Chaos Engineering For Understanding Consensus Algorithms Performance in Permissioned Blockchains
Shiv Sondhi, Sherif Saad, Kevin Shi et al.
A critical component of any blockchain or distributed ledger technology (DLT) platform is the consensus algorithm. Blockchain consensus algorithms are the primary vehicle for the nodes within a blockchain network to reach an agreement. In recent years, many blockchain consensus algorithms have been proposed mainly for private and permissioned blockchain networks. However, the performance of these algorithms and their reliability in hostile environments or the presence of byzantine and other network failures are not well understood. In addition, the testing and validation of blockchain applications come with many technical challenges. In this paper, we apply chaos engineering and testing to understand the performance of consensus algorithms in the presence of different loads, byzantine failure and other communication failure scenarios. We apply chaos engineering to evaluate the performance of three different consensus algorithms (PBFT, Clique, Raft) and their respective blockchain platforms. We measure the blockchain network's throughput, latency, and success rate while executing chaos and load tests. We develop lightweight blockchain applications to execute our test in a semi-production environment. Our results show that using chaos engineering helps understand how different consensus algorithms perform in a hostile or unreliable environment and the limitations of blockchain platforms. Our work demonstrates the benefits of using chaos engineering in testing complex distributed systems such as blockchain networks.
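The three metrics named in this abstract (throughput, latency, success rate) can be computed from per-transaction timestamps along these lines. This is only a sketch under stated assumptions: the `TxRecord` type, the `summarize` function, and the use of mean latency over a fixed test window are illustrative choices, not the measurement setup used in the paper.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class TxRecord:
    submitted_at: float            # submission time, in seconds
    committed_at: Optional[float]  # commit time in seconds, or None if the transaction failed


def summarize(records: List[TxRecord], window_seconds: float) -> Dict[str, float]:
    """Summarize one test run over a window of `window_seconds` seconds."""
    if not records or window_seconds <= 0:
        raise ValueError("need at least one record and a positive window")
    # Latency is only defined for transactions that were actually committed.
    latencies = [
        r.committed_at - r.submitted_at
        for r in records
        if r.committed_at is not None
    ]
    return {
        "throughput_tps": len(latencies) / window_seconds,  # committed transactions per second
        "mean_latency_s": mean(latencies) if latencies else float("nan"),
        "success_rate": len(latencies) / len(records),      # committed / submitted
    }
```

Comparing these summaries between a baseline run and a run with injected faults is one simple way to see how a consensus algorithm degrades.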
CR · Jan 27, 2020
Interpreting Machine Learning Malware Detectors Which Leverage N-gram Analysis
William Briguglio, Sherif Saad
In cyberattack detection and prevention systems, cybersecurity analysts always prefer solutions that are as interpretable and understandable as rule-based or signature-based detection. This is because of the need to tune and optimize these solutions to mitigate and control the effect of false positives and false negatives. Interpreting machine learning models is a new and open challenge. However, it is expected that an interpretable machine learning solution will be domain-specific. For instance, interpretable solutions for machine learning models in healthcare are different from solutions in malware detection. This is because the models are complex, and most of them work as a black box. Recently, the increased ability for malware authors to bypass antimalware systems has forced security specialists to look to machine learning for creating robust detection systems. If these systems are to be relied on in the industry, then, among other challenges, they must also explain their predictions. The objective of this paper is to evaluate the current state-of-the-art ML model interpretability techniques when applied to ML-based malware detectors. We demonstrate interpretability techniques in practice and evaluate the effectiveness of existing interpretability techniques in the malware analysis domain.
CR · Nov 25, 2019
JSLess: A Tale of a Fileless Javascript Memory-Resident Malware
Sherif Saad, Farhan Mahmood, William Briguglio et al.
New computing paradigms, modern feature-rich programming languages and off-the-shelf software libraries have enabled the development of new sophisticated malware families. Evidence of this phenomenon is the recent growth of fileless malware attacks. Fileless malware or memory resident malware is an example of an Advanced Volatile Threat (AVT). In a fileless malware attack, the malware writes itself directly onto the main memory (RAM) of the compromised device without leaving any trace on the compromised device's file system. For this reason, fileless malware presents a difficult challenge for traditional malware detection tools and in particular signature-based detection. Moreover, fileless malware forensics and reverse engineering are nearly impossible using traditional methods. The majority of fileless malware attacks in the wild take advantage of MS PowerShell; however, fileless malware is not limited to MS PowerShell. In this paper, we designed and implemented a fileless malware by taking advantage of new features in Javascript and HTML5. The proposed fileless malware could infect any device that supports Javascript and HTML5. It serves as a proof-of-concept (PoC) to demonstrate the threats of fileless malware in web applications. We used the proposed fileless malware to evaluate existing methods and techniques for malware detection in web applications. We tested the proposed fileless malware with several free and commercial malware detection tools that apply both static and dynamic analysis. The proposed fileless malware bypassed all the anti-malware detection tools included in our study. In our analysis, we discussed the limitations of existing approaches/tools and suggested possible detection and mitigation techniques.
CR · May 18, 2019
The Curious Case of Machine Learning In Malware Detection
Sherif Saad, William Briguglio, Haytham Elmiligi
In this paper, we argue that machine learning techniques are not ready for malware detection in the wild. Given the current trend in malware development and the increase of unconventional malware attacks, we expect that dynamic malware analysis is the future for antimalware detection and prevention systems. A comprehensive review of machine learning for malware detection is presented. Then, we discuss how malware detection in the wild presents unique challenges for the current state-of-the-art machine learning techniques. We define three critical problems that limit the success of malware detectors powered by machine learning in the wild. Next, we discuss possible solutions to these challenges and present the requirements of next-generation malware detection. Finally, we outline potential research directions in machine learning for malware detection.