CRJan 20, 2022
NapierOne: A modern mixed file data set alternative to Govdocs1Simon R Davies, Richard Macfarlane, William J Buchanan
It was found when reviewing the ransomware detection research literature that almost no proposal provided enough detail on how the test data set was created, or sufficient description of its actual content, to allow it to be recreated by other researchers interested in reconstructing their environment and validating the research results. A modern cybersecurity mixed file data set called NapierOne is presented, primarily aimed at, but not limited to, ransomware detection and forensic analysis research. NapierOne was designed to address this deficiency in reproducibility and improve consistency by facilitating research replication and repeatability. The methodology used in the creation of this data set is also described in detail. The data set was inspired by the Govdocs1 data set and it is intended that NapierOne be used as a complement to this original data set. An investigation was performed with the goal of determining the common files types currently in use. No specific research was found that explicitly provided this information, so an alternative consensus approach was employed. This involved combining the findings from multiple sources of file type usage into an overall ranked list. After which 5000 real-world example files were gathered, and a specific data subset created, for each of the common file types identified. In some circumstances, multiple data subsets were created for a specific file type, each subset representing a specific characteristic for that file type. For example, there are multiple data subsets for the ZIP file type with each subset containing examples of a specific compression method. Ransomware execution tends to produce files that have high entropy, so examples of file types that naturally have this attribute are also present.
CRJun 28, 2021
Differential Area Analysis for Ransomware Attack Detection within Mixed File DatasetsSimon R Davies, Richard Macfarlane, William J Buchanan
The threat from ransomware continues to grow both in the number of affected victims as well as the cost incurred by the people and organisations impacted in a successful attack. In the majority of cases, once a victim has been attacked there remain only two courses of action open to them; either pay the ransom or lose their data. One common behaviour shared between all crypto ransomware strains is that at some point during their execution they will attempt to encrypt the users' files. Previous research Penrose et al. (2013); Zhao et al. (2011) has highlighted the difficulty in differentiating between compressed and encrypted files using Shannon entropy as both file types exhibit similar values. One of the experiments described in this paper shows a unique characteristic for the Shannon entropy of encrypted file header fragments. This characteristic was used to differentiate between encrypted files and other high entropy files such as archives. This discovery was leveraged in the development of a file classification model that used the differential area between the entropy curve of a file under analysis and one generated from random data. When comparing the entropy plot values of a file under analysis against one generated by a file containing purely random numbers, the greater the correlation of the plots is, the higher the confidence that the file under analysis contains encrypted data.
CRDec 15, 2020
Evaluation of Live Forensic Techniques in Ransomware Attack MitigationSimon R. Davies, Richard Macfarlane, William J. Buchanan
Memory was captured from a system infected by ransomware and its contents was examined using live forensic tools, with the intent of identifying the symmetric encryption keys being used. NotPetya, Bad Rabbit and Phobos hybrid ransomware samples were tested during the investigation. If keys were discovered, the following two steps were also performed. Firstly, a timeline was manually created by combining data from multiple sources to illustrate the ransomware's behaviour as well as showing when the encryption keys were present in memory and how long they remained there. Secondly, an attempt was made to decrypt the files encrypted by the ransomware using the found keys. In all cases, the investigation was able to confirm that it was possible to identify the encryption keys used. A description of how these found keys were then used to successfully decrypt files that had been encrypted during the execution of the ransomware is also given. The resulting generated timelines provided a excellent way to visualise the behaviour of the ransomware and the encryption key management practices it employed, and from a forensic investigation and possible mitigation point of view, when the encryption keys are in memory.
CRJun 2, 2020
Towards Identifying Human Actions, Intent, and Severity of APT Attacks Applying Deception Techniques -- An ExperimentJoel Chacon, Sean McKeown, Richard Macfarlane
Attacks by Advanced Persistent Threats (APTs) have been shown to be difficult to detect using traditional signature- and anomaly-based intrusion detection approaches. Deception techniques such as decoy objects, often called honey items, may be deployed for intrusion detection and attack analysis, providing an alternative to detect APT behaviours. This work explores the use of honey items to classify intrusion interactions, differentiating automated attacks from those which need some human reasoning and interaction towards APT detection. Multiple decoy items are deployed on honeypots in a virtual honey network, some as breadcrumbs to detect indications of a structured manual attack. Monitoring functionality was created around Elastic Stack with a Kibana dashboard created to display interactions with various honey items. APT type manual intrusions are simulated by an experienced pentesting practitioner carrying out simulated attacks. Interactions with honey items are evaluated in order to determine their suitability for discriminating between automated tools and direct human intervention. The results show that it is possible to differentiate automatic attacks from manual structured attacks; from the nature of the interactions with the honey items. The use of honey items found in the honeypot, such as in later parts of a structured attack, have been shown to be successful in classification of manual attacks, as well as towards providing an indication of severity of the attacks
CRJul 24, 2019
Privacy Parameter Variation Using RAPPOR on a Malware DatasetPeter Aaby, Juanjo Mata De Acuna, Richard Macfarlane et al.
Stricter data protection regulations and the poor application of privacy protection techniques have resulted in a requirement for data-driven companies to adopt new methods of analysing sensitive user data. The RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) method adds parameterised noise, which must be carefully selected to maintain adequate privacy without losing analytical value. This paper applies RAPPOR privacy parameter variations against a public dataset containing a list of running Android applications data. The dataset is filtered and sampled into small (10,000); medium (100,000); and large (1,200,000) sample sizes while applying RAPPOR with ? = 10; 1.0; and 0.1 (respectively low; medium; high privacy guarantees). Also, in order to observe detailed variations within high to medium privacy guarantees (? = 0.5 to 1.0), a second experiment is conducted by progressively.