Nikolaos Pitropakis

CR
h-index: 16
17 papers
358 citations
Novelty: 24%
AI Score: 32

17 Papers

CR · Dec 10, 2025
Comparative Analysis of Hash-based Malware Clustering via K-Means

Aink Acrie Soe Thein, Nikolaos Pitropakis, Pavlos Papadopoulos et al.

With the adoption of multiple digital devices in everyday life, the cyber-attack surface has increased. Adversaries are continuously exploring new avenues to exploit them and deploy malware. On the other hand, detection approaches typically employ hashing-based algorithms such as SSDeep, TLSH, and IMPHash to capture structural and behavioural similarities among binaries. This work focuses on the analysis and evaluation of these techniques for clustering malware samples using the K-means algorithm. More specifically, we experimented with established malware families and traits and found that TLSH and IMPHash produce more distinct, semantically meaningful clusters, whereas SSDeep is more efficient for broader classification tasks. The findings of this work can guide the development of more robust threat-detection mechanisms and adaptive security mechanisms.
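As a rough illustration of the clustering step this abstract describes, the sketch below runs plain K-means (Lloyd's algorithm) over fixed-length hex digests. The digests are made up, and the per-character vector encoding and all function names are assumptions for illustration only; the paper's actual feature extraction for SSDeep, TLSH, and IMPHash is not given in this listing.

```python
import random


def digest_to_vector(hex_digest):
    """Map a hex digest to a numeric vector, one value (0-15) per hex character."""
    return [int(ch, 16) for ch in hex_digest]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical 16-character digests: two groups of near-identical samples.
digests = [
    "00112233445566aa",
    "00112233445566ab",
    "00112233445567aa",
    "ffeeddccbbaa9900",
    "ffeeddccbbaa9901",
    "ffeeddccbbaa9800",
]
labels = kmeans([digest_to_vector(d) for d in digests], k=2)
```

Treating digest characters as coordinates is a simplification: fuzzy hashes normally come with their own distance measures, so a real pipeline would likely derive features differently.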

CR · Feb 7, 2022
Ransomware: Analysing the Impact on Windows Active Directory Domain Services

Grant McDonald, Pavlos Papadopoulos, Nikolaos Pitropakis et al.

Ransomware has become an increasingly popular type of malware over the past decade and continues to rise in popularity due to its high profitability. Organisations and enterprises have become prime targets for ransomware as they are more likely to succumb to ransom demands as part of operating expenses to counter the cost incurred from downtime. Despite the prevalence of ransomware as a threat towards organisations, there is very little information outlining how ransomware affects Windows Server environments, and particularly its proprietary domain services such as Active Directory. Hence, we aim to increase the cyber situational awareness of organisations and corporations that utilise these environments. Dynamic analysis was performed using three ransomware variants to uncover how crypto-ransomware affects Windows Server-specific services and processes. Our work outlines the practical investigation undertaken as WannaCry, TeslaCrypt, and Jigsaw were acquired and tested against several domain services. The findings showed that none of the three variants stopped the processes and decidedly left all domain services untouched. However, although the services remained operational, they became uniquely dysfunctional as ransomware encrypted the files pertaining to those services.

CR · Dec 19, 2021
Privacy-preserving and Trusted Threat Intelligence Sharing using Distributed Ledgers

Hisham Ali, Pavlos Papadopoulos, Jawad Ahmad et al.

Threat information sharing is considered one of the proactive defensive approaches for enhancing the overall security of trusted partners. Trusted partner organizations can provide access to past and current cybersecurity threats for reducing the risk of a potential cyberattack. The requirements for threat information sharing range from simplistic sharing of documents to threat intelligence sharing. Therefore, the storage and sharing of highly sensitive threat information raises considerable concerns regarding constructing a secure, trusted threat information exchange infrastructure. Establishing a trusted ecosystem for threat sharing will promote the validity, security, anonymity, scalability, latency efficiency, and traceability of the stored information that protects it from unauthorized disclosure. This paper proposes a system that ensures the security principles mentioned above by utilizing a distributed ledger technology that provides secure decentralized operations through smart contracts and provides a privacy-preserving ecosystem for threat information storage and sharing regarding the MITRE ATT&CK framework.

CR · Dec 6, 2021
PAN-DOMAIN: Privacy-preserving Sharing and Auditing of Infection Identifier Matching

William Abramson, William J. Buchanan, Sarwar Sayeed et al.

The spread of COVID-19 has highlighted the need for a robust contact tracing infrastructure that enables infected individuals to have their contacts traced, and followed up with a test. The key entities involved within a contact tracing infrastructure may include the Citizen, a Testing Centre (TC), a Health Authority (HA), and a Government Authority (GA). Typically, these different domains need to communicate with each other about an individual. A common approach is when a citizen discloses his personally identifiable information to both the HA and a TC; if the test result comes back positive, the information is used by the TC to alert the HA. Along with this, there can be other trusted entities that have other key elements of data related to the citizen. However, the existing approaches have severe flaws in terms of privacy and security. Additionally, the aforementioned approaches are not transparent, and the efficacy of their implementations is often questioned. In order to overcome these challenges, this paper outlines the PAN-DOMAIN infrastructure that allows for citizen identifiers to be matched amongst the TA, the HA and the GA. PAN-DOMAIN ensures that the citizen can keep control of the mapping between the trusted entities using a trusted converter, and has access to an audit log.

CR · Oct 5, 2021
Evaluating Tooling and Methodology when Analysing Bitcoin Mixing Services After Forensic Seizure

Edward Henry Young, Christos Chrysoulas, Nikolaos Pitropakis et al.

Little or no research has been directed at the forensic analysis of the Bitcoin mixing or 'tumbling' services themselves. This work is intended to examine effective tooling and methodology for recovering forensic artifacts from two privacy-focused mixing services, namely Obscuro, which uses the secure enclave on Intel chips to provide enhanced confidentiality, and Wasabi wallet, which uses CoinJoin to mix and obfuscate cryptocurrencies. These wallets were set up on VMs, and several forensic tools were then used to examine these VM images for relevant forensic artifacts. These forensic tools were able to recover a broad range of forensic artifacts and found both network forensics and logging files to be a useful source of artifacts to deanonymize these mixing services.

CR · Sep 17, 2021
GLASS: Towards Secure and Decentralized eGovernance Services using IPFS

Christos Chrysoulas, Amanda Thomson, Nikolaos Pitropakis et al.

The continuously advancing digitization has provided answers to the bureaucratic problems faced by eGovernance services. This innovation led them to an era of automation; however, it has broadened the attack surface and made them a popular target for cyber attacks. eGovernance services utilize the internet, which is currently a location-addressed system where whoever controls the location controls not only the content itself, but the integrity of that content, and the access to that content. We propose GLASS, a decentralised solution which combines the InterPlanetary File System (IPFS) with Distributed Ledger technology and Smart Contracts to secure eGovernance services. We also create a testbed environment where we measure the IPFS performance.
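The contrast between location addressing and content addressing can be sketched in a few lines. The toy store below is keyed by the SHA-256 digest of the stored bytes, so integrity can be checked by anyone holding the address. It is an assumption-laden simplification: IPFS itself uses content identifiers (CIDs) built on multihash rather than a bare hex digest, and the class and variable names here are hypothetical, not part of GLASS.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of the content."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, not from where it is stored.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Integrity check: re-hash the content and compare with the address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentAddressedStore()
address = store.put(b"example eGovernance record")
```

Because the address is a function of the content, storing the same bytes twice yields the same address, and any modification of the stored bytes is detectable on retrieval.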

LG · Apr 26, 2021
Launching Adversarial Attacks against Network Intrusion Detection Systems for IoT

Pavlos Papadopoulos, Oliver Thornewill von Essen, Nikolaos Pitropakis et al.

As the internet continues to be populated with new devices and emerging technologies, the attack surface grows exponentially. Technology is shifting towards a profit-driven Internet of Things market where security is an afterthought. Traditional defending approaches are no longer sufficient to detect both known and unknown attacks with high accuracy. Machine learning intrusion detection systems have proven their success in identifying unknown attacks with high precision. Nevertheless, machine learning models are also vulnerable to attacks. Adversarial examples can be used to evaluate the robustness of a designed model before it is deployed. Further, using adversarial examples is critical to creating a robust model designed for an adversarial environment. Our work evaluates both traditional machine learning and deep learning models' robustness using the Bot-IoT dataset. Our methodology included two main approaches. First, label poisoning, used to cause incorrect classification by the model. Second, the fast gradient sign method, used to evade detection measures. The experiments demonstrated that an attacker could manipulate or circumvent detection with significant probability.
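The two attack approaches named in this abstract can be sketched on a toy model. The example below applies the fast gradient sign method, x_adv = x + epsilon * sign(dL/dx), to a logistic-regression detector, and includes a simple label-flipping helper as one form of label poisoning. The weights, input values, and function names are hypothetical; the paper's actual models and Bot-IoT features are not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that sample x is malicious under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, y, epsilon):
    """Fast gradient sign method: step each feature by epsilon in the
    direction that increases the cross-entropy loss for true label y."""
    p = predict(weights, bias, x)
    # For logistic regression, dL/dx_i = (p - y) * w_i.
    gradient = [(p - y) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


def flip_labels(labels, indices):
    """Label poisoning sketch: flip the binary labels at the given positions."""
    poisoned = list(labels)
    for i in indices:
        poisoned[i] = 1 - poisoned[i]
    return poisoned


# Hypothetical detector parameters and one malicious sample (y = 1).
weights, bias = [2.0, -1.0, 0.5], -0.5
x = [1.0, 0.2, 0.4]
x_adv = fgsm(weights, bias, x, y=1, epsilon=0.6)
```

In this toy setting the original sample scores above the 0.5 decision threshold while the perturbed one falls below it, which is the evasion effect the abstract refers to.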

CR · Mar 29, 2021
Privacy and Trust Redefined in Federated Machine Learning

Pavlos Papadopoulos, Will Abramson, Adam J. Hall et al.

A common privacy issue in traditional machine learning is that data needs to be disclosed for the training procedures. In situations with highly sensitive data such as healthcare records, accessing this information is challenging and often prohibited. Luckily, privacy-preserving technologies have been developed to overcome this hurdle by distributing the computation of the training and ensuring data privacy for the data owners. The distribution of the computation to multiple participating entities introduces new privacy complications and risks. In this paper, we present a privacy-preserving decentralised workflow that facilitates trusted federated learning among participants. Our proof-of-concept defines a trust framework instantiated using decentralised identity technologies being developed under Hyperledger projects Aries/Indy/Ursa. Only entities in possession of Verifiable Credentials issued from the appropriate authorities are able to establish secure, authenticated communication channels authorised to participate in a federated learning workflow related to mental health data.

CR · Nov 18, 2020
A Privacy-Preserving Healthcare Framework Using Hyperledger Fabric

Charalampos Stamatellis, Pavlos Papadopoulos, Nikolaos Pitropakis et al.

Electronic health record (EHR) management systems require the adoption of effective technologies when health information is being exchanged. Current management approaches often face risks that may expose medical record storage solutions to common security attack vectors. However, healthcare-oriented blockchain solutions can provide a decentralized, anonymous and secure EHR handling approach. This paper presents PREHEALTH, a privacy-preserving EHR management solution that uses distributed ledger technology and an Identity Mixer (Idemix). The paper describes a proof-of-concept implementation that uses the Hyperledger Fabric's permissioned blockchain framework. The proposed solution is able to store patient records effectively whilst providing anonymity and unlinkability. Experimental performance evaluation results demonstrate the scheme's efficiency and feasibility for real-world scale deployment.

CR · Sep 10, 2020
Review and Critical Analysis of Privacy-preserving Infection Tracking and Contact Tracing

William J Buchanan, Muhammad Ali Imran, Masood Ur-Rehman et al.

Outbreaks of viruses have necessitated contact tracing and infection tracking methods. Despite various efforts, there is currently no standard scheme for the tracing and tracking. Many nations of the world have, therefore, developed their own ways where carriers of disease could be tracked and their contacts traced. These are generalized methods developed either in a distributed manner giving citizens control of their identity or in a centralised manner where a health authority gathers data on those who are carriers. This paper outlines some of the most significant approaches that have been established for contact tracing around the world. A comprehensive review on the key enabling methods used to realise the infrastructure around these infection tracking and contact tracing methods is also presented and recommendations are made for the most effective way to develop such a practice.

CR · Aug 14, 2020
Privacy Preserving Passive DNS

Pavlos Papadopoulos, Nikolaos Pitropakis, William J. Buchanan et al.

The Domain Name System (DNS) was created to resolve easily remembered names to the IP addresses of web servers. When it was initially created, security was not a major concern; nowadays, this lack of inherent security and trust has exposed the global DNS infrastructure to malicious actors. The passive DNS data collection process creates a database containing various DNS data elements, some of which are personal and need to be protected to preserve the privacy of the end users. To this end, we propose the use of distributed ledger technology. We use Hyperledger Fabric to create a permissioned blockchain, which only authorized entities can access. The proposed solution supports queries for storing and retrieving data from the blockchain ledger, allowing the use of the passive DNS database for further analysis, e.g. for the identification of malicious domain names. Additionally, it effectively protects the DNS personal data from unauthorized entities, including the administrators that can act as potential malicious insiders, and allows only the data owners to perform queries over these data. We evaluated our proposed solution by creating a proof-of-concept experimental setup that passively collects DNS data from a network and then uses the distributed ledger technology to store the data in an immutable ledger, thus providing a full historical overview of all the records.

CY · Jul 27, 2020
Testing And Hardening IoT Devices Against the Mirai Botnet

Christopher Kelly, Nikolaos Pitropakis, Sean McKeown et al.

A large majority of cheap Internet of Things (IoT) devices that arrive brand new, and are configured with out-of-the-box settings, are not being properly secured by the manufacturers, and are vulnerable to existing malware lurking on the Internet. Among them is the Mirai botnet which has had its source code leaked to the world, allowing any malicious actor to configure and unleash it. A combination of software assets not being utilised safely and effectively is exposing consumers to a full compromise. We configured and attacked four different IoT devices using the Mirai libraries. Our experiments concluded that three out of the four devices were vulnerable to the Mirai malware and became infected when deployed using their default configuration. This demonstrates that the original security configurations are not sufficient to provide acceptable levels of protection for consumers, leaving their devices exposed and vulnerable. By analysing the Mirai libraries and its attack vectors, we were able to determine appropriate device configuration countermeasures to harden the devices against this botnet, which were successfully validated through experimentation.

CR · Jun 3, 2020
A Distributed Trust Framework for Privacy-Preserving Machine Learning

Will Abramson, Adam James Hall, Pavlos Papadopoulos et al.

When training a machine learning model, it is standard procedure for the researcher to have full knowledge of both the data and model. However, this engenders a lack of trust between data owners and data scientists. Data owners are justifiably reluctant to relinquish control of private information to third parties. Privacy-preserving techniques distribute computation in order to ensure that data remains in the control of the owner while learning takes place. However, architectures distributed amongst multiple agents introduce an entirely new set of security and trust complications. These include data poisoning and model theft. This paper outlines a distributed infrastructure which is used to facilitate peer-to-peer trust between distributed agents collaboratively performing a privacy-preserving workflow. Our outlined prototype sets industry gatekeepers and governance bodies as credential issuers. Before participating in the distributed learning workflow, malicious actors must first negotiate valid credentials. We detail a proof of concept using Hyperledger Aries, Decentralised Identifiers (DIDs) and Verifiable Credentials (VCs) to establish a distributed trust architecture during a privacy-preserving machine learning experiment. Specifically, we utilise secure and authenticated DID communication channels in order to facilitate a federated learning workflow related to mental health care data.

CR · May 13, 2020
Phishing URL Detection Through Top-level Domain Analysis: A Descriptive Approach

Orestis Christou, Nikolaos Pitropakis, Pavlos Papadopoulos et al.

Phishing is considered to be one of the most prevalent cyber-attacks because of its immense flexibility and alarmingly high success rate. Even with adequate training and high situational awareness, it can still be hard for users to continually be aware of the URL of the website they are visiting. Traditional detection methods rely on blocklists and content analysis, both of which require time-consuming human verification. Thus, there have been attempts focusing on the predictive filtering of such URLs. This study aims to develop a machine-learning model to detect fraudulent URLs which can be used within the Splunk platform. Inspired by similar approaches in the literature, we trained the SVM and Random Forests algorithms using malicious and benign datasets found in the literature and one dataset that we created. We evaluated the algorithms' performance with precision and recall, reaching up to 85% precision and 87% recall in the case of Random Forests, while SVM achieved up to 90% precision and 88% recall using only descriptive features.
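For readers unfamiliar with the two metrics reported in this abstract, the sketch below computes precision and recall from binary predictions. The labels are invented toy data (1 = phishing, 0 = benign) and the function name is hypothetical; nothing here reproduces the study's datasets or results.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: share of flagged URLs that really are phishing.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of phishing URLs that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy ground truth and predictions (made up for illustration).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

With these toy labels there are three true positives, one false positive, and one false negative, so both metrics come out to 0.75.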

CR · Sep 20, 2019
Performance Analysis of TLS for Quantum Robust Cryptography on a Constrained Device

Jon Barton, William J Buchanan, Nikolaos Pitropakis et al.

Advances in quantum computing make Shor's algorithm for factorising numbers ever more tractable. This threatens the security of any cryptographic system that relies on the difficulty of factorisation. It also threatens methods based on discrete logarithms, such as the Diffie-Hellman key exchange method. For a cryptographic system to remain secure against a quantum adversary, we need to build methods based on a hard mathematical problem, which are not susceptible to Shor's algorithm and which create Post Quantum Cryptography (PQC). While high-powered computing devices may be able to run these new methods, we need to investigate how well these methods run on devices with limited power. This paper outlines an evaluation framework for PQC within constrained devices, and contributes to the area by providing benchmarks of the front-running algorithms on a popular single-board low-power device.

CR · Jul 24, 2019
Predicting Malicious Insider Threat Scenarios Using Organizational Data and a Heterogeneous Stack-Classifier

Adam James Hall, Nikolaos Pitropakis, William J Buchanan et al.

Insider threats continue to present a major challenge for the information security community. Despite constant research taking place in this area, a substantial gap still exists between the requirements of this community and the solutions that are currently available. This paper uses the CERT dataset r4.2 along with a series of machine learning classifiers to predict the occurrence of a particular malicious insider threat scenario: the uploading of sensitive information to WikiLeaks before leaving the organization. These algorithms are aggregated into a meta-classifier which has a stronger predictive performance than its constituent models. It also defines a methodology for performing pre-processing on organizational log data into daily user summaries for classification, and is used to train multiple classifiers. Boosting is also applied to optimise classifier accuracy. Overall the models are evaluated through analysis of their associated confusion matrix and Receiver Operating Characteristic (ROC) curve, and the best performing classifiers are aggregated into an ensemble classifier. This meta-classifier has an accuracy of 96.2% with an area under the ROC curve of 0.988.

CR · Aug 28, 2017
Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse

Panagiotis Kintis, Najmeh Miramirkhani, Charles Lever et al.

Domain squatting is a common adversarial practice where attackers register domain names that are purposefully similar to popular domains. In this work, we study a specific type of domain squatting called "combosquatting," in which attackers register domains that combine a popular trademark with one or more phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first large-scale, empirical study of combosquatting by analyzing more than 468 billion DNS records, collected from passive and active DNS data sources over almost six years. We find that almost 60% of abusive combosquatting domains live for more than 1,000 days, and even worse, we observe increased activity associated with combosquatting year over year. Moreover, we show that combosquatting is used to perform a spectrum of different types of abuse including phishing, social engineering, affiliate abuse, trademark abuse, and even advanced persistent threats. Our results suggest that combosquatting is a real problem that requires increased scrutiny by the security community.