William J. Buchanan

h-index49

28papers

8,969citations

Novelty26%

AI Score40

Ranked #74,438 of 194,257 authors (top 38%)#1,784 in CR (top 26%)

28 Papers

7.5CRJul 14

LLM vs. SAST: A Technical Analysis on Detecting Coding Bugs of GPT4-Advanced Data Analysis

Madjid G. Tehrani, Eldar Sultanow, William J. Buchanan et al.

Large language models (LLMs) are increasingly used for code understanding, yet their practical effectiveness for vulnerability detection relative to Static Application Security Testing (SAST) remains insufficiently quantified. We present a controlled comparative study between GPT-4 (Advanced Data Analysis) and two SAST tools (SonarQube and Cloud Defence) on 32 curated security scenarios representing common coding pitfalls. Each scenario is scored with a binary detection rule, the two SAST outputs are aggregated using a logical OR baseline, and paired outcomes are evaluated using McNemar's test for statistical significance. In our dataset, GPT-4 correctly detected 30 of 32 scenarios (93.75\%), while the aggregated SAST baseline detected 11 of 32. The paired comparison shows a statistically significant difference in detection performance in favour of GPT-4. We also discuss security considerations and operational constraints for integrating LLM-enhanced vulnerability scanning into secure software development workflows.

2.3CRJan 13, 2023

An Omnidirectional Approach to Touch-based Continuous Authentication

Peter Aaby, Mario Valerio Giuffrida, William J Buchanan et al.

This paper focuses on how touch interactions on smartphones can provide a continuous user authentication service through behaviour captured by a touchscreen. While efforts are made to advance touch-based behavioural authentication, researchers often focus on gathering data, tuning classifiers, and enhancing performance by evaluating touch interactions in a sequence rather than independently. However, such systems only work by providing data representing distinct behavioural traits. The typical approach separates behaviour into touch directions and creates multiple user profiles. This work presents an omnidirectional approach which outperforms the traditional method independent of the touch direction - depending on optimal behavioural features and a balanced training set. Thus, we evaluate five behavioural feature sets using the conventional approach against our direction-agnostic method while testing several classifiers, including an Extra-Tree and Gradient Boosting Classifier, which is often overlooked. Results show that in comparison with the traditional, an Extra-Trees classifier and the proposed approach are superior when combining strokes. However, the performance depends on the applied feature set. We find that the TouchAlytics feature set outperforms others when using our approach when combining three or more strokes. Finally, we highlight the importance of reporting the mean area under the curve and equal error rate for single-stroke performance and varying the sequence of strokes separately.

7.4CRApr 20

Eccfrog512ck2: An Enhanced 512-bit Weierstrass Elliptic Curve

Víctor Duarte Melo, William J. Buchanan

Whilst many key exchange and digital signature methods use the NIST P256 (secp256r1) and secp256k1 curves, there is often a demand for increased security. With these curves, we have a 128-bit security. These security levels can be increased to 256-bit security with NIST P-521 Curve 448 and Brainpool-P512. This paper outlines a new curve - Eccfrog512ck2 - and which provides 256-bit security and enhanced performance over NIST P-521. Along with this, it has side-channel resistance and is designed to avoid weaknesses such as related to the MOV attack. It shows that Eccfrog512ck2 can have a 61.5% speed-up on scalar multiplication and a 33.3% speed-up on point generation over the NIST P-521 curve.

3.6CVDec 19, 2025

Adversarial Robustness of Vision in Open Foundation Models

Jonathon Fox, William J Buchanan, Pavlos Papadopoulos

With the increase in deep learning, it becomes increasingly difficult to understand the model in which AI systems can identify objects. Thus, an adversary could aim to modify an image by adding unseen elements, which will confuse the AI in its recognition of an entity. This paper thus investigates the adversarial robustness of LLaVA-1.5-13B and Meta's Llama 3.2 Vision-8B-2. These are tested for untargeted PGD (Projected Gradient Descent) against the visual input modality, and empirically evaluated on the Visual Question Answering (VQA) v2 dataset subset. The results of these adversarial attacks are then quantified using the standard VQA accuracy metric. This evaluation is then compared with the accuracy degradation (accuracy drop) of LLaVA and Llama 3.2 Vision. A key finding is that Llama 3.2 Vision, despite a lower baseline accuracy in this setup, exhibited a smaller drop in performance under attack compared to LLaVA, particularly at higher perturbation levels. Overall, the findings confirm that the vision modality represents a viable attack vector for degrading the performance of contemporary open-weight VLMs, including Meta's Llama 3.2 Vision. Furthermore, they highlight that adversarial robustness does not necessarily correlate directly with standard benchmark performance and may be influenced by underlying architectural and training factors.

4.9CRJul 26, 2019Code

Protocol for Asynchronous, Reliable, Secure and Efficient Consensus (PARSEC) Version 2.0

Pierre Chevalier, Bartlomiej Kaminski, Fraser Hutchison et al.

In this paper we present an open source, fully asynchronous, leaderless algorithm for reaching consensus in the presence of Byzantine faults in an asynchronous network. We prove the algorithm's correctness provided that less than a third of participating nodes are faulty. We also present a way of applying the algorithm to a network with dynamic membership, i.e. a network in which nodes can join and leave at will. The core contribution of this paper is an optimal model in the definition of an asynchronous BFT protocol, and which is resilient to 1/3 byzantine nodes. This model matches an agreement with probability one (unlike some probabilistic methods), and where a common coin is used as a source of randomization so that it respects the FLP impossibility result.

2.9CRFeb 7, 2022

Ransomware: Analysing the Impact on Windows Active Directory Domain Services

Grant McDonald, Pavlos Papadopoulos, Nikolaos Pitropakis et al.

Ransomware has become an increasingly popular type of malware across the past decade and continues to rise in popularity due to its high profitability. Organisations and enterprises have become prime targets for ransomware as they are more likely to succumb to ransom demands as part of operating expenses to counter the cost incurred from downtime. Despite the prevalence of ransomware as a threat towards organisations, there is very little information outlining how ransomware affects Windows Server environments, and particularly its proprietary domain services such as Active Directory. Hence, we aim to increase the cyber situational awareness of organisations and corporations that utilise these environments. Dynamic analysis was performed using three ransomware variants to uncover how crypto-ransomware affects Windows Server-specific services and processes. Our work outlines the practical investigation undertaken as WannaCry, TeslaCrypt, and Jigsaw were acquired and tested against several domain services. The findings showed that none of the three variants stopped the processes and decidedly left all domain services untouched. However, although the services remained operational, they became uniquely dysfunctional as ransomware encrypted the files pertaining to those services

8.7CRJan 20, 2022

NapierOne: A modern mixed file data set alternative to Govdocs1

Simon R Davies, Richard Macfarlane, William J Buchanan

It was found when reviewing the ransomware detection research literature that almost no proposal provided enough detail on how the test data set was created, or sufficient description of its actual content, to allow it to be recreated by other researchers interested in reconstructing their environment and validating the research results. A modern cybersecurity mixed file data set called NapierOne is presented, primarily aimed at, but not limited to, ransomware detection and forensic analysis research. NapierOne was designed to address this deficiency in reproducibility and improve consistency by facilitating research replication and repeatability. The methodology used in the creation of this data set is also described in detail. The data set was inspired by the Govdocs1 data set and it is intended that NapierOne be used as a complement to this original data set. An investigation was performed with the goal of determining the common files types currently in use. No specific research was found that explicitly provided this information, so an alternative consensus approach was employed. This involved combining the findings from multiple sources of file type usage into an overall ranked list. After which 5000 real-world example files were gathered, and a specific data subset created, for each of the common file types identified. In some circumstances, multiple data subsets were created for a specific file type, each subset representing a specific characteristic for that file type. For example, there are multiple data subsets for the ZIP file type with each subset containing examples of a specific compression method. Ransomware execution tends to produce files that have high entropy, so examples of file types that naturally have this attribute are also present.

3.8CRDec 22, 2021

Electromagnetic Side-Channel Attack Resilience against PRESENT Lightweight Block Cipher

Nilupulee A. Gunathilake, Ahmed Al-Dubai, William J. Buchanan et al.

Lightweight cryptography is a novel diversion from conventional cryptography that targets internet-of-things (IoT) platform due to resource constraints. In comparison, it offers smaller cryptographic primitives such as shorter key sizes, block sizes and lesser energy drainage. The main focus can be seen in algorithm developments in this emerging subject. Thus, verification is carried out based upon theoretical (mathematical) proofs mostly. Among the few available side-channel analysis studies found in literature, the highest percentage is taken by power attacks. PRESENT is a promising lightweight block cipher to be included in IoT devices in the near future. Thus, the emphasis of this paper is on lightweight cryptology, and our investigation shows unavailability of a correlation electromagnetic analysis (CEMA) of it. Hence, in an effort to fill in this research gap, we opted to investigate the capabilities of CEMA against the PRESENT algorithm. This work aims to determine the probability of secret key leakage with a minimum number of electromagnetic (EM) waveforms possible. The process initially started from a simple EM analysis (SEMA) and gradually enhanced up to a CEMA. This paper presents our methodology in attack modelling, current results that indicate a probability of leaking seven bytes of the key and upcoming plans for optimisation. In addition, introductions to lightweight cryptanalysis and theories of EMA are also included.

3.8CRDec 19, 2021

Blockchain-based Platform for Secure Sharing and Validation of Vaccination Certificates

Mwrwan Abubakar, Pádraig McCarron, Zakwan Jaroucheh et al.

The COVID-19 pandemic has recently emerged as a worldwide health emergency that necessitates coordinated international measures. To contain the virus's spread, governments and health organisations raced to develop vaccines that would lower Covid-19 morbidity, relieve pressure on healthcare systems, and allow economies to open. As a way forward after the COVID-19 vaccination, the Vaccination certificate has been adopted to help the authorities formulate policies by controlling cross-border travelling. To resolve significant privacy concerns and remove the need for relying on third parties to maintain trust and control the user's data, in this paper, we leverage blockchain technologies in developing a secure and verifiable vaccination certificate. Our approach has the advantage of utilising a hybrid architecture that implements different advanced technologies, such as smart contracts, interPlanetary File System (IPFS), and Self-sovereign Identity (SSI). We will rely on verifiable credentials paired with smart contracts to implement on-chain access control decisions and provide on-chain verification and validation of the user and issuer DIDs. The usability of this approach was further analysed, particularly concerning performance and security. Our analysis proved that our approach satisfies vaccination certificate security requirements.

3.8CRDec 19, 2021

Privacy-preserving and Trusted Threat Intelligence Sharing using Distributed Ledgers

Hisham Ali, Pavlos Papadopoulos, Jawad Ahmad et al.

Threat information sharing is considered as one of the proactive defensive approaches for enhancing the overall security of trusted partners. Trusted partner organizations can provide access to past and current cybersecurity threats for reducing the risk of a potential cyberattack - the requirements for threat information sharing range from simplistic sharing of documents to threat intelligence sharing. Therefore, the storage and sharing of highly sensitive threat information raises considerable concerns regarding constructing a secure, trusted threat information exchange infrastructure. Establishing a trusted ecosystem for threat sharing will promote the validity, security, anonymity, scalability, latency efficiency, and traceability of the stored information that protects it from unauthorized disclosure. This paper proposes a system that ensures the security principles mentioned above by utilizing a distributed ledger technology that provides secure decentralized operations through smart contracts and provides a privacy-preserving ecosystem for threat information storage and sharing regarding the MITRE ATT\&CK framework.

3.8CRDec 6, 2021

PAN-DOMAIN: Privacy-preserving Sharing and Auditing of Infection Identifier Matching

William Abramson, William J. Buchanan, Sarwar Sayeed et al.

The spread of COVID-19 has highlighted the need for a robust contact tracing infrastructure that enables infected individuals to have their contacts traced, and followed up with a test. The key entities involved within a contact tracing infrastructure may include the Citizen, a Testing Centre (TC), a Health Authority (HA), and a Government Authority (GA). Typically, these different domains need to communicate with each other about an individual. A common approach is when a citizen discloses his personally identifiable information to both the HA a TC, if the test result comes positive, the information is used by the TC to alert the HA. Along with this, there can be other trusted entities that have other key elements of data related to the citizen. However, the existing approaches comprise severe flaws in terms of privacy and security. Additionally, the aforementioned approaches are not transparent and often being questioned for the efficacy of the implementations. In order to overcome the challenges, this paper outlines the PAN-DOMAIN infrastructure that allows for citizen identifiers to be matched amongst the TA, the HA and the GA. PAN-DOMAIN ensures that the citizen can keep control of the mapping between the trusted entities using a trusted converter, and has access to an audit log.

3.8CRDec 3, 2021

A Privacy-Preserving Platform for Recording COVID-19 Vaccine Passports

Masoud Barati, William J. Buchanan, Owen Lo et al.

Digital vaccine passports are one of the main solutions which would allow the restart of travel in a post COVID-19 world. Trust, scalability and security are all key challenges one must overcome in implementing a vaccine passport. Initial approaches attempt to solve this problem by using centralised systems with trusted authorities. However, sharing vaccine passport data between different organisations, regions and countries has become a major challenge. This paper designs a new platform architecture for creating, storing and verifying digital COVID-19 vaccine certifications. The platform makes use of the InterPlanetary File System (IPFS) to guarantee there is no single point of failure and allow data to be securely distributed globally. Blockchain and smart contracts are also integrated into the platform to define policies and log access rights to vaccine passport data while ensuring all actions are audited and verifiably immutable. Our proposed platform realises General Data Protection Regulation (GDPR) requirements in terms of user consent, data encryption, data erasure and accountability obligations. We assess the scalability and performance of the platform using IPFS and Blockchain test networks.

6.6CROct 5, 2021

Evaluating Tooling and Methodology when Analysing Bitcoin Mixing Services After Forensic Seizure

Edward Henry Young, Christos Chrysoulas, Nikolaos Pitropakis et al.

Little or no research has been directed to analysis and researching forensic analysis of the Bitcoin mixing or 'tumbling' service themselves. This work is intended to examine effective tooling and methodology for recovering forensic artifacts from two privacy focused mixing services namely Obscuro which uses the secure enclave on intel chips to provide enhanced confidentiality and Wasabi wallet which uses CoinJoin to mix and obfuscate crypto currencies. These wallets were set up on VMs and then several forensic tools used to examine these VM images for relevant forensic artifacts. These forensic tools were able to recover a broad range of forensic artifacts and found both network forensics and logging files to be a useful source of artifacts to deanonymize these mixing services.

6.6CRSep 17, 2021Code

GLASS: Towards Secure and Decentralized eGovernance Services using IPFS

Christos Chrysoulas, Amanda Thomson, Nikolaos Pitropakis et al.

The continuously advancing digitization has provided answers to the bureaucratic problems faced by eGovernance services. This innovation led them to an era of automation it has broadened the attack surface and made them a popular target for cyber attacks. eGovernance services utilize internet, which is currently a location addressed system where whoever controls the location controls not only the content itself, but the integrity of that content, and the access to that content. We propose GLASS, a decentralised solution which combines the InterPlanetary File System (IPFS) with Distributed Ledger technology and Smart Contracts to secure EGovernance services. We also create a testbed environment where we measure the IPFS performance.

3.8CRJun 29, 2021

Electromagnetic Analysis of an Ultra-Lightweight Cipher: PRESENT

Nilupulee A. Gunathilake, Ahmed Al-Dubai, William J. Buchanan et al.

Side-channel attacks are an unpredictable risk factor in cryptography. Therefore, continuous observations of physical leakages are essential to minimise vulnerabilities associated with cryptographic functions. Lightweight cryptography is a novel approach in progress towards internet-of-things (IoT) security. Thus, it would provide sufficient data and privacy protection in such a constrained ecosystem. IoT devices are resource-limited in terms of data rates (in kbps), power maintainability (battery) as well as hardware and software footprints (physical size, internal memory, RAM/ROM). Due to the difficulty in handling conventional cryptographic algorithms, lightweight ciphers consist of small key sizes, block sizes and few operational rounds. Unlike in the past, affordability to perform side-channel attacks using inexpensive electronic circuitries is becoming a reality. Hence, cryptanalysis of physical leakage in these emerging ciphers is crucial. Among existing studies, power analysis seems to have enough attention in research, whereas other aspects such as electromagnetic, timing, cache and optical attacks continue to be appropriately evaluated to play a role in forensic analysis. As a result, we started analysing electromagnetic emission leakage of an ultra-lightweight block cipher, PRESENT. According to the literature, PRESENT promises to be adequate for IoT devices, and there still seems not to exist any work regarding correlation electromagnetic analysis (CEMA) of it. Firstly, we conducted simple electromagnetic analysis in both time and frequency domains and then proceeded towards CEMA attack modelling. This paper provides a summary of the related literature (IoT, lightweight cryptography, side-channel attacks and EMA), our methodology, current outcomes and future plans for the optimised results.

8.8CRJun 28, 2021

Differential Area Analysis for Ransomware Attack Detection within Mixed File Datasets

Simon R Davies, Richard Macfarlane, William J Buchanan

The threat from ransomware continues to grow both in the number of affected victims as well as the cost incurred by the people and organisations impacted in a successful attack. In the majority of cases, once a victim has been attacked there remain only two courses of action open to them; either pay the ransom or lose their data. One common behaviour shared between all crypto ransomware strains is that at some point during their execution they will attempt to encrypt the users' files. Previous research Penrose et al. (2013); Zhao et al. (2011) has highlighted the difficulty in differentiating between compressed and encrypted files using Shannon entropy as both file types exhibit similar values. One of the experiments described in this paper shows a unique characteristic for the Shannon entropy of encrypted file header fragments. This characteristic was used to differentiate between encrypted files and other high entropy files such as archives. This discovery was leveraged in the development of a file classification model that used the differential area between the entropy curve of a file under analysis and one generated from random data. When comparing the entropy plot values of a file under analysis against one generated by a file containing purely random numbers, the greater the correlation of the plots is, the higher the confidence that the file under analysis contains encrypted data.

10.6LGApr 26, 2021

Launching Adversarial Attacks against Network Intrusion Detection Systems for IoT

Pavlos Papadopoulos, Oliver Thornewill von Essen, Nikolaos Pitropakis et al.

As the internet continues to be populated with new devices and emerging technologies, the attack surface grows exponentially. Technology is shifting towards a profit-driven Internet of Things market where security is an afterthought. Traditional defending approaches are no longer sufficient to detect both known and unknown attacks to high accuracy. Machine learning intrusion detection systems have proven their success in identifying unknown attacks with high precision. Nevertheless, machine learning models are also vulnerable to attacks. Adversarial examples can be used to evaluate the robustness of a designed model before it is deployed. Further, using adversarial examples is critical to creating a robust model designed for an adversarial environment. Our work evaluates both traditional machine learning and deep learning models' robustness using the Bot-IoT dataset. Our methodology included two main approaches. First, label poisoning, used to cause incorrect classification by the model. Second, the fast gradient sign method, used to evade detection measures. The experiments demonstrated that an attacker could manipulate or circumvent detection with significant probability.

17.0CRJan 10, 2021

An Experimental Analysis of Attack Classification Using Machine Learning in IoT Networks

Andrew Churcher, Rehmat Ullah, Jawad Ahmad et al.

In recent years, there has been a massive increase in the amount of Internet of Things (IoT) devices as well as the data generated by such devices. The participating devices in IoT networks can be problematic due to their resource-constrained nature, and integrating security on these devices is often overlooked. This has resulted in attackers having an increased incentive to target IoT devices. As the number of attacks possible on a network increases, it becomes more difficult for traditional intrusion detection systems (IDS) to cope with these attacks efficiently. In this paper, we highlight several machine learning (ML) methods such as k-nearest neighbour (KNN), support vector machine (SVM), decision tree (DT), naive Bayes (NB), random forest (RF), artificial neural network (ANN), and logistic regression (LR) that can be used in IDS. In this work, ML algorithms are compared for both binary and multi-class classification on Bot-IoT dataset. Based on several parameters such as accuracy, precision, recall, F1 score, and log loss, we experimentally compared the aforementioned ML algorithms. In the case of HTTP distributed denial-of-service (DDoS) attack, the accuracy of RF is 99%. Furthermore, other simulation results-based precision, recall, F1 score, and log loss metric reveal that RF outperforms on all types of attacks in binary classification. However, in multi-class classification, KNN outperforms other ML algorithms with an accuracy of 99%, which is 4% higher than RF.

11.5CRSep 10, 2020

Review and Critical Analysis of Privacy-preserving Infection Tracking and Contact Tracing

William J Buchanan, Muhammad Ali Imran, Masood Ur-Rehman et al.

The outbreak of viruses have necessitated contact tracing and infection tracking methods. Despite various efforts, there is currently no standard scheme for the tracing and tracking. Many nations of the world have therefore, developed their own ways where carriers of disease could be tracked and their contacts traced. These are generalized methods developed either in a distributed manner giving citizens control of their identity or in a centralised manner where a health authority gathers data on those who are carriers. This paper outlines some of the most significant approaches that have been established for contact tracing around the world. A comprehensive review on the key enabling methods used to realise the infrastructure around these infection tracking and contact tracing methods is also presented and recommendations are made for the most effective way to develop such a practice.

5.2CRAug 28, 2020

TRUSTD: Combat Fake Content using Blockchain and Collective Signature Technologies

Zakwan Jaroucheh, Mohamad Alissa, William J Buchanan

The growing trend of sharing news/contents, through social media platforms and the World Wide Web has been seen to impact our perception of the truth, altering our views about politics, economics, relationships, needs and wants. This is because of the growing spread of misinformation and disinformation intentionally or unintentionally by individuals and organizations. This trend has grave political, social, ethical, and privacy implications for society due to 1) the rapid developments in the field of Machine Learning (ML) and Deep Learning (DL) algorithms in creating realistic-looking yet fake digital content (such as text, images, and videos), 2) the ability to customize the content feeds and to create a polarized so-called "filter-bubbles" leveraging the availability of the big-data. Therefore, there is an ethical need to combat the flow of fake content. This paper attempts to resolve some of the aspects of this combat by presenting a high-level overview of TRUSTD, a blockchain and collective signature-based ecosystem to help content creators in getting their content backed by the community, and to help users judge on the credibility and correctness of these contents.

2.9CRFeb 12, 2020

Wi-Fi Channel Saturation as a Mechanism to Improve Passive Capture of Bluetooth Through Channel Usage Restriction

Ian Lowe, William J Buchanan, Richard J Macfarlane et al.

Bluetooth is a short-range wireless technology that provides audio and data links between personal smartphones and playback devices, such as speakers, headsets and car entertainment systems. Since its introduction in 2001, security researchers have suggested that the protocol is weak, and prone to a variety of attacks against its authentication, link management and encryption schemes. Key researchers in the field have suggested that reliable passive sniffing of Bluetooth traffic would enable the practical application of a range of currently hypothesised attacks. Restricting Bluetooth's frequency hopping behaviour by manipulation of the available channels, in order to make brute force attacks more effective has been a frequently proposed avenue of future research from the literature. This paper has evaluated the proposed approach in a series of experiments using the software defined radio tools and custom hardware developed by the Ubertooth project. The work concludes that the mechanism suggested by previous researchers may not deliver the proposed improvements, but describes an as yet undocumented interaction between Bluetooth and Wi-Fi technologies which may provide a Denial of Service attack mechanism.

2.9CRJan 22, 2020

An authentication protocol based on chaos and zero knowledge proof

Will Major, William J Buchanan, Jawad Ahmad

Port Knocking is a method for authenticating clients through a closed stance firewall, and authorising their requested actions, enabling severs to offer services to authenticated clients, without opening ports on the firewall. Advances in port knocking have resulted in an increase in complexity in design, preventing port knocking solutions from realising their potential. This paper proposes a novel port knocking solution, named Crucible, which is a secure method of authentication, with high usability and features of stealth, allowing servers and services to remain hidden and protected. Crucible is a stateless solution, only requiring the client memorise a command, the server's IP and a chosen password. The solution is forwarded as a method for protecting servers against attacks ranging from port scans, to zero-day exploitation. To act as a random oracle for both client and server, cryptographic hashes were generated through chaotic systems.

4.9CRSep 20, 2019

Performance Analysis of TLS for Quantum Robust Cryptography on a Constrained Device

Jon Barton, William J Buchanan, Nikolaos Pitropakis et al.

Advances in quantum computing make Shor's algorithm for factorising numbers ever more tractable. This threatens the security of any cryptographic system which often relies on the difficulty of factorisation. It also threatens methods based on discrete logarithms, such as with the Diffie-Hellman key exchange method. For a cryptographic system to remain secure against a quantum adversary, we need to build methods based on a hard mathematical problem, which are not susceptible to Shor's algorithm and which create Post Quantum Cryptography (PQC). While high-powered computing devices may be able to run these new methods, we need to investigate how well these methods run on limited powered devices. This paper outlines an evaluation framework for PQC within constrained devices, and contributes to the area by providing benchmarks of the front-running algorithms on a popular single-board low-power device.

4.9CRJul 27, 2019

Discovering Encrypted Bot and Ransomware Payloads Through Memory Inspection Without A Priori Knowledge

Peter McLaren, William J Buchanan, Gordon Russell et al.

Malware writers frequently try to hide the activities of their agents within tunnelled traffic. Within the Kill Chain model the infection time is often measured in seconds, and if the infection is not detected and blocked, the malware agent, such as a bot, will often then set up a secret channel to communicate with its controller. In the case of ransomware the communicated payload may include the encryption key used for the infected host to register its infection. As a malware infection can spread across a network in seconds, it is often important to detect its activities on the air, in memory and at-rest. Malware increasingly uses encrypted channels for communicating with their controllers. This paper presents a new approach to discovering the cryptographic artefacts of real malware clients that use cryptographic libraries of the Microsoft Windows operating system. This enables malware secret communications to be discovered without any prior malware knowledge.

9.7CRJul 25, 2019

Machine learning and semantic analysis of in-game chat for cyberbullying

Shane Murnion, William J. Buchanan, Adrian Smales et al.

One major problem with cyberbullying research is the lack of data, since researchers are traditionally forced to rely on survey data where victims and perpetrators self-report their impressions. In this paper, an automatic data collection system is presented that continuously collects in-game chat data from one of the most popular online multi-player games: World of Tanks. The data was collected and combined with other information about the players from available online data services. It presents a scoring scheme to enable identification of cyberbullying based on current research. Classification of the collected data was carried out using simple feature detection with SQL database queries and compared to classification from AI-based sentiment text analysis services that have recently become available and further against manually classified data using a custom-built classification client built for this paper. The simple SQL classification proved to be quite useful at identifying some features of toxic chat such as the use of bad language or racist sentiments, however the classification by the more sophisticated online sentiment analysis services proved to be disappointing. The results were then examined for insights into cyberbullying within this game and it was shown that it should be possible to reduce cyberbullying within the World of Tanks game by a significant factor by simply freezing the player's ability to communicate through the in-game chat function for a short period after the player is killed within a match. It was also shown that very new players are much less likely to engage in cyberbullying, suggesting that it may be a learned behaviour from other players.

6.8CRJul 24, 2019

Privacy Parameter Variation Using RAPPOR on a Malware Dataset

Peter Aaby, Juanjo Mata De Acuna, Richard Macfarlane et al.

Stricter data protection regulations and the poor application of privacy protection techniques have resulted in a requirement for data-driven companies to adopt new methods of analysing sensitive user data. The RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) method adds parameterised noise, which must be carefully selected to maintain adequate privacy without losing analytical value. This paper applies RAPPOR privacy parameter variations against a public dataset containing a list of running Android applications data. The dataset is filtered and sampled into small (10,000); medium (100,000); and large (1,200,000) sample sizes while applying RAPPOR with ? = 10; 1.0; and 0.1 (respectively low; medium; high privacy guarantees). Also, in order to observe detailed variations within high to medium privacy guarantees (? = 0.5 to 1.0), a second experiment is conducted by progressively.

2.7CRJul 24, 2019

A Forensic Audit of the Tor Browser Bundle

Matt Muir, Petra Leimich, William J Buchanan

The increasing use of encrypted data within file storage and in network communications leaves investigators with many challenges. One of the most challenging is the Tor protocol, as its main focus is to protect the privacy of the user, in both its local footprint within a host and over a network connection. The Tor browser, though, can leave behind digital artefacts which can be used by an investigator. This paper outlines an experimental methodology and provides results for evidence trails which can be used within real-life investigations.

6.8CRJul 24, 2019

Predicting Malicious Insider Threat Scenarios Using Organizational Data and a Heterogeneous Stack-Classifier

Adam James Hall, Nikolaos Pitropakis, William J Buchanan et al.

Insider threats continue to present a major challenge for the information security community. Despite constant research taking place in this area; a substantial gap still exists between the requirements of this community and the solutions that are currently available. This paper uses the CERT dataset r4.2 along with a series of machine learning classifiers to predict the occurrence of a particular malicious insider threat scenario - the uploading sensitive information to wiki leaks before leaving the organization. These algorithms are aggregated into a meta-classifier which has a stronger predictive performance than its constituent models. It also defines a methodology for performing pre-processing on organizational log data into daily user summaries for classification, and is used to train multiple classifiers. Boosting is also applied to optimise classifier accuracy. Overall the models are evaluated through analysis of their associated confusion matrix and Receiver Operating Characteristic (ROC) curve, and the best performing classifiers are aggregated into an ensemble classifier. This meta-classifier has an accuracy of \textbf{96.2\%} with an area under the ROC curve of \textbf{0.988}.