48.6CRJun 2
High-Precision APT Malware Attribution with Out-of-Scope ResiliencePeter Williams, Adam Sobey, Erisa Karafili
Early attribution of Advanced Persistent Threat (APT) activity can help defenders prioritise investigation, select countermeasures, and reduce the impact of an intrusion. Malware provides useful attribution evidence, but automated APT malware attribution remains difficult in practice. Existing approaches are typically trained and evaluated as closed-set classifiers over a limited number of known APT groups. In operational environments, however, classifiers are likely to encounter samples from groups not represented during training. Closed-set classifiers are then forced to assign such samples to known groups, producing unsupported and potentially misleading attributions. We present a high-precision APT malware attribution method based on ranked binary classifiers with explicit abstention. Rather than training a single multi-class classifier, our approach trains and tunes two binary classifiers per APT group, ranks the classifiers by validation performance, and applies them sequentially. A sample is attributed only when a classifier provides sufficient evidence; otherwise, it abstains. We evaluate the method on the APT Malware dataset and on a larger combined dataset designed to stress-test out-of-scope behaviour. On the APT Malware dataset, the method achieves higher precision than previously published results on the same dataset. In the most challenging setting, where 87% of test samples came from 60 APT groups excluded from training, the method abstained on 94% of out-of-scope samples while maintaining 92% precision and 95% selective accuracy on the samples it classified.
CRAug 9, 2024
AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition DatasetPritam Deka, Sampath Rajapaksha, Ruby Rani et al.
Cyber-attack attribution is an important process that allows experts to put in place attacker-oriented countermeasures and legal actions. The analysts mainly perform attribution manually, given the complex nature of this task. AI and, more specifically, Natural Language Processing (NLP) techniques can be leveraged to support cybersecurity analysts during the attribution process. However powerful these techniques are, they need to deal with the lack of datasets in the attack attribution domain. In this work, we will fill this gap and will provide, to the best of our knowledge, the first dataset on cyber-attack attribution. We designed our dataset with the primary goal of extracting attack attribution information from cybersecurity texts, utilizing named entity recognition (NER) methodologies from the field of NLP. Unlike other cybersecurity NER datasets, ours offers a rich set of annotations with contextual details, including some that span phrases and sentences. We conducted extensive experiments and applied NLP techniques to demonstrate the dataset's effectiveness for attack attribution. These experiments highlight the potential of Large Language Models (LLMs) capabilities to improve the NER tasks in cybersecurity datasets for cyber-attack attribution.
LOJul 15, 2019
Time-Stamped Claim LogicJoão Rasga, Cristina Sernadas, Erisa Karafili et al.
The main objective of this paper is to define a logic for reasoning about distributed time-stamped claims. Such a logic is interesting for theoretical reasons, i.e., as a logic per se, but also because it has a number of practical applications, in particular when one needs to reason about a huge amount of pieces of evidence collected from different sources, where some of the pieces of evidence may be contradictory and some sources are considered to be more trustworthy than others. We introduce the Time-Stamped Claim Logic including a sound and complete sequent calculus that allows one to reduce the size of the collected set of evidence and removes inconsistencies, i.e., the logic ensures that the result is consistent with respect to the trust relations considered. In order to show how Time-Stamped Claim Logic can be used in practice, we consider a concrete cyber-attribution case study.
CRApr 30, 2019
An Argumentation-Based Reasoner to Assist Digital Investigation and Attribution of Cyber-AttacksErisa Karafili, Linna Wang, Emil C. Lupu
We expect an increase in the frequency and severity of cyber-attacks that comes along with the need for efficient security countermeasures. The process of attributing a cyber-attack helps to construct efficient and targeted mitigating and preventive security measures. In this work, we propose an argumentation-based reasoner (ABR) as a proof-of-concept tool that can help a forensics analyst during the analysis of forensic evidence and the attribution process. Given the evidence collected from a cyber-attack, our reasoner can assist the analyst during the investigation process, by helping him/her to analyze the evidence and identify who performed the attack. Furthermore, it suggests to the analyst where to focus further analyses by giving hints of the missing evidence or new investigation paths to follow. ABR is the first automatic reasoner that can combine both technical and social evidence in the analysis of a cyber-attack, and that can also cope with incomplete and conflicting information. To illustrate how ABR can assist in the analysis and attribution of cyber-attacks we have used examples of cyber-attacks and their analyses as reported in publicly available reports and online literature. We do not mean to either agree or disagree with the analyses presented therein or reach attribution conclusions.
CRApr 26, 2018
A Formal Approach to Analyzing Cyber-Forensics EvidenceErisa Karafili, Matteo Cristani, Luca Viganò
The frequency and harmfulness of cyber-attacks are increasing every day, and with them also the amount of data that the cyber-forensics analysts need to collect and analyze. In this paper, we propose a formal analysis process that allows an analyst to filter the enormous amount of evidence collected and either identify crucial information about the attack (e.g., when it occurred, its culprit, its target) or, at the very least, perform a pre-analysis to reduce the complexity of the problem in order to then draw conclusions more swiftly and efficiently. We introduce the Evidence Logic EL for representing simple and derived pieces of evidence from different sources. We propose a procedure, based on monotonic reasoning, that rewrites the pieces of evidence with the use of tableau rules, based on relations of trust between sources and the reasoning behind the derived evidence, and yields a consistent set of pieces of evidence. As proof of concept, we apply our analysis process to a concrete cyber-forensics case study.
CRMay 1, 2017
Argumentation-based Security for Social GoodErisa Karafili, Antonis C. Kakas, Nikolaos I. Spanoudakis et al.
The increase of connectivity and the impact it has in every day life is raising new and existing security problems that are becoming important for social good. We introduce two particular problems: cyber attack attribution and regulatory data sharing. For both problems, decisions about which rules to apply, should be taken under incomplete and context dependent information. The solution we propose is based on argumentation reasoning, that is a well suited technique for implementing decision making mechanisms under conflicting and incomplete information. Our proposal permits us to identify the attacker of a cyber attack and decide the regulation rule that should be used while using and sharing data. We illustrate our solution through concrete examples.