CR, Jun 4
Exploring the connection between coding habits and cognitive styles in malware developers
Vasilis Vouvoutsis, Constantinos Patsakis, Fran Casino
Malware research primarily studies the results, the methods, and the impact. Even from an offensive security perspective, what is examined is the method, not the development strategy of the offender. This study investigates the behavioral signatures and coding patterns embedded in malware source code. By analyzing a large corpus of leaked malware code and comparing it with carefully selected benign open-source software, we apply static application security testing and compute multiple software metrics. Based on cognitive psychology and criminological theories, our work interprets differences in code structure and quality as behavioral indicators, reflecting distinct motivational structures, risk tolerances, and development strategies of malware authors compared to benign software developers. Our findings reveal that malware code is generally smaller, less documented, and exhibits higher cyclomatic complexity per function, with reduced use of abstraction mechanisms such as classes and closures. Vulnerability analysis further reveals that malware exhibits more issues of the types that benign code typically avoids, suggesting a minimal investment in secure development practices. These patterns imply a development style optimized for expedience, operational secrecy, and evasion rather than long-term maintainability. Nonetheless, the code quality metrics indicate that malware does not deviate from benign software enough to be distinctive. By framing code metrics as proxies for behavioral signals and strategic choices, we demonstrate how quantitative software analysis can enrich behavioral cybersecurity research, offering new insights into the practices and priorities of malware developers. Our results pave the way for further research in the behavioral profiling of cyber offenders.
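To make the "cyclomatic complexity per function" metric in the abstract above concrete, the following is a minimal sketch of how such a number could be approximated for Python source. It is not the tooling used in the study, which the abstract does not name. The set of node types counted as decision points and the function names `cyclomatic_complexity` and `complexity_per_function` are assumptions made for this illustration.

```python
import ast

# Node types counted as decision points. This set is an assumption of the
# sketch; real metric tools differ in what they count.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate McCabe complexity as 1 + the number of decision points."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands and adds two extra paths.
            score += len(node.values) - 1
    return score


def complexity_per_function(source: str) -> dict:
    """Map each function name in `source` to its approximate complexity.

    Branches inside nested functions are also counted towards the
    enclosing function, a simplification of this sketch.
    """
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


# Hypothetical sample code, used only to exercise the functions above.
SAMPLE = '''
def flat(x):
    return x + 1

def branchy(x):
    if x > 0 and x < 10:
        for i in range(x):
            if i % 2:
                return i
    return -1
'''

scores = complexity_per_function(SAMPLE)
```

Averaging such per-function scores over a corpus gives one number that can be compared between two code bases, which is the kind of comparison the abstract describes.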
CR, May 25, 2022
SoK: Cross-border Criminal Investigations and Digital Evidence
Fran Casino, Claudia Pina, Pablo López-Aguilar et al.
Digital evidence underpins the majority of crimes, as its analysis is an integral part of almost every criminal investigation. Even if we temporarily disregard the numerous challenges in the collection and analysis of digital evidence, the exchange of the evidence among the different stakeholders has many thorny issues. Of specific interest are cross-border criminal investigations, as their complexity is significantly higher due to the heterogeneity of legal frameworks, which beyond causing time bottlenecks can also become prohibitive. The aim of this article is to analyse the current state of practice of cross-border investigations, considering the efficacy of current collaboration protocols along with the challenges and drawbacks to be overcome. Further to performing a legally oriented research treatise, we recall all the challenges raised in the literature and discuss them from a more practical yet global perspective. Thus, this article paves the way to enabling practitioners and stakeholders to leverage horizontal strategies to fill in the identified gaps in a timely and accurate manner.
CR, Aug 10, 2021
Research trends, challenges, and emerging topics of digital forensics: A review of reviews
Fran Casino, Tom Dasaklis, Georgios Spathoulas et al.
Due to its critical role in cybersecurity, digital forensics has received significant attention from researchers and practitioners alike. The ever increasing sophistication of modern cyberattacks is directly related to the complexity of evidence acquisition, which often requires the use of several technologies. To date, researchers have presented many surveys and reviews on the field. However, such articles focused on the advances of each particular domain of digital forensics individually. Therefore, while each of these surveys helps researchers and practitioners keep up with the latest advances in a particular domain of digital forensics, the global perspective is missing. Aiming to fill this gap, we performed a qualitative review of reviews in the field of digital forensics, determined the main topics in digital forensics and identified their main challenges. Our analysis provides enough evidence to prove that the digital forensics community could benefit from closer collaborations and cross-topic research, since it is apparent that researchers and practitioners are trying to find solutions to the same problems in parallel, sometimes without noticing it.
CR, May 25, 2021
The Cynicism of Modern Cybercrime: Automating the Analysis of Surface Web Marketplaces
Nikolaos Lykousas, Vasilios Koutsokostas, Fran Casino et al.
Cybercrime is continuously growing in numbers and becoming more sophisticated. Currently, there are various monetisation and money laundering methods, creating a huge, underground economy worldwide. A clear indicator of these activities is online marketplaces which allow cybercriminals to trade their stolen assets and services. While traditionally these marketplaces are available through the dark web, several of them have emerged in the surface web. In this work, we perform a longitudinal analysis of a surface web marketplace. The information was collected through targeted web scraping that allowed us to identify hundreds of merchants' profiles for the most widely used surface web marketplaces. In this regard, we discuss the products traded in these markets, their prices, their availability, and the exchange currency. This analysis is performed in an automated way through a machine learning-based pipeline, allowing us to quickly and accurately extract the needed information. The outcomes of our analysis evince that illegal practices are leveraged in surface marketplaces and that there are no effective mechanisms for their takedown at the time of writing.
CR, Apr 12, 2021
EtherClue: Digital investigation of attacks on Ethereum smart contracts
Simon Joseph Aquilina, Fran Casino, Mark Vella et al.
Programming errors in Ethereum smart contracts can result in catastrophic financial losses from stolen cryptocurrency. While vulnerability detectors can prevent vulnerable contracts from being deployed, this does not mean that such contracts will not be deployed. Once a vulnerable contract is instantiated on the blockchain and becomes the target of attacks, the identification of exploit transactions becomes indispensable in assessing whether it has actually been exploited and identifying which malicious or subverted accounts were involved. In this work, we study the problem of post-factum investigation of Ethereum attacks using Indicators of Compromise (IoCs) specially crafted for use in the blockchain. IoC definitions need to capture the side-effects of successful exploitation in the context of the Ethereum blockchain. Therefore, we define a model for smart contract execution, comprising multiple abstraction levels that mirror the multiple views of code execution on a blockchain. Subsequently, we compare IoCs defined across the different levels in terms of their effectiveness and practicality through EtherClue, a prototype tool for investigating Ethereum security incidents. Our results illustrate that coarse-grained IoCs defined over blocks of transactions can detect exploit transactions with less computation; however, they are contract-specific and suffer from false negatives. On the other hand, fine-grained IoCs defined over virtual machine instructions can avoid these pitfalls at the expense of increased computation, while nevertheless remaining applicable for practical use.
CR, Mar 30, 2021
Analysis and Correlation of Visual Evidence in Campaigns of Malicious Office Documents
Fran Casino, Nikolaos Totosis, Theodoros Apostolopoulos et al.
Many malware campaigns use Microsoft (MS) Office documents as droppers to download and execute their malicious payload. Such campaigns often use these documents because MS Office is installed on billions of devices and because these files allow the execution of arbitrary VBA code. Recent versions of MS Office prevent the automatic execution of VBA macros, so malware authors try to trick users into enabling the content via images that, e.g., forge system or technical errors. In this work, we leverage these visual elements to construct lightweight malware signatures that can be applied with minimal effort. We test and validate our approach using an extensive database of malware samples and identify correlations between different campaigns, which illustrate that some campaigns are either using the same tools or that there is some collaboration between them.
CR, Aug 18, 2020
A blockchain-based Forensic Model for Financial Crime Investigation: The Embezzlement Scenario
Lamprini Zarpala, Fran Casino
The financial crime landscape is evolving along with the digitization of financial services. In this context, laws and regulations cannot efficiently cope with a fast-moving industry such as finance, which translates into late adoption of measures and legal voids, providing a fruitful landscape for malicious actors. In parallel, blockchain technology and its promising features, such as immutability, verifiability, and authentication, enhance the opportunities of financial forensics. In this paper, we focus on an embezzlement scheme and we provide a forensic-by-design methodology for its investigation. In addition, the feasibility and adaptability of our approach mean it can be extended to embrace digital investigations on other types of schemes. We provide a functional implementation based on smart contracts and we integrate standardised forensic flows and chain of custody preservation mechanisms. Finally, we discuss the benefits and challenges of the symbiotic relationship between blockchain and financial investigations, along with future research directions.
CR, Aug 6, 2020
Intercepting Hail Hydra: Real-Time Detection of Algorithmically Generated Domains
Fran Casino, Nikolaos Lykousas, Ivan Homoliak et al.
A crucial technical challenge for cybercriminals is to keep control over the potentially millions of infected devices that build up their botnets, without compromising the robustness of their attacks. A single, fixed C&C server, for example, can be trivially detected either by binary or traffic analysis and immediately sink-holed or taken down by security researchers or law enforcement. Botnets often use Domain Generation Algorithms (DGAs), primarily to evade take-down attempts. DGAs can enlarge the lifespan of a malware campaign, thus potentially enhancing its profitability. They can also contribute to hindering attack accountability. In this work, we introduce HYDRAS, the most comprehensive and representative dataset of Algorithmically-Generated Domains (AGD) available to date. The dataset contains more than 100 DGA families, including both real-world and adversarially designed ones. We analyse the dataset and discuss the possibility of differentiating between benign requests (to real domains) and malicious ones (to AGDs) in real-time. The simultaneous study of so many families and variants introduces several challenges; nonetheless, it alleviates biases found in previous literature employing small datasets which are frequently overfitted, exploiting characteristic features of particular families that do not generalise well. We thoroughly compare our approach with the current state-of-the-art and highlight some methodological shortcomings in the actual state of practice. The outcomes obtained show that our proposed approach significantly outperforms the current state-of-the-art in terms of both classification performance and efficiency.
CR, May 26, 2020
SoK: Blockchain Solutions for Forensics
Thomas K. Dasaklis, Fran Casino, Constantinos Patsakis
As the digitization of information-intensive processes gains momentum, concern is growing about how to deal with the ever-growing problem of cybercrime. To this end, law enforcement officials and security firms use sophisticated digital forensics techniques for analyzing and investigating cybercrimes. However, multi-jurisdictional mandates, interoperability issues, the massive amount of evidence gathered (multimedia, text etc.) and multiple stakeholders involved (law enforcement agencies, security firms etc.) are just a few among the various challenges that hinder the adoption and implementation of sound digital forensics schemes. Blockchain technology has been recently proposed as a viable solution for developing robust digital forensics mechanisms. In this paper, we provide an overview and classification of the available blockchain-based digital forensic tools, and we further describe their main features. We also offer a thorough analysis of the various benefits and challenges of the symbiotic relationship between blockchain technology and the current digital forensics approaches, as proposed in the available literature. Based on the findings, we identify various research gaps, and we suggest future research directions that are expected to be of significant value both for academics and practitioners in the field of digital forensics.
CR, Dec 12, 2019
Exploiting Statistical and Structural Features for the Detection of Domain Generation Algorithms
Constantinos Patsakis, Fran Casino
Nowadays, malware campaigns have reached a high level of sophistication, thanks to the use of cryptography and covert communication channels over traditional protocols and services. In this regard, a typical approach to evade botnet identification and takedown mechanisms is domain fluxing through the use of Domain Generation Algorithms (DGAs). These algorithms produce an overwhelming number of domain names that the infected device tries to communicate with to find the Command and Control server, yet only a small fraction of them is actually registered. Due to the high number of domain names, the blacklisting approach is rendered useless. Therefore, the botmaster may pivot the control dynamically and hinder botnet detection mechanisms. To counter this problem, many security mechanisms resort to solutions that try to identify domains from a DGA based on the randomness of their name. In this work, we explore hard-to-detect families of DGAs, as they are constructed to bypass these mechanisms. More precisely, they are based on the use of dictionaries, so the domains seem to be user-generated. Therefore, the corresponding generated domains pass many filters that look for, e.g., high-entropy strings. To address this challenge, we propose an accurate and efficient probabilistic approach to detect them. We test and validate the proposed solution through extensive experiments with a sound dataset containing all the wordlist-based DGA families that exhibit this behaviour and compare it with other state-of-the-art methods, practically showing the efficacy and prevalence of our proposal.
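The abstract above notes that wordlist-based DGAs pass filters that look for high-entropy strings. The sketch below illustrates that observation only; it is not the probabilistic detector proposed by the authors. The threshold value and both example domain names are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Character-level Shannon entropy of a string, in bits per character."""
    total = len(label)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(label).values())


# Hypothetical cut-off, not taken from the paper or from any product.
ENTROPY_THRESHOLD = 3.8


def looks_random(domain: str) -> bool:
    """Naive filter: flag a domain whose first label has high entropy."""
    label = domain.split(".")[0].lower()
    return shannon_entropy(label) > ENTROPY_THRESHOLD


# Both domain names are invented examples, not real DGA output.
random_like = "kq3vx9zt7wpl2hfd.example"     # random-looking label
wordlist_like = "tablewindowgarden.example"  # concatenated dictionary words

flagged_random = looks_random(random_like)
flagged_wordlist = looks_random(wordlist_like)
```

With these inputs the random-looking label is flagged while the dictionary-based one is not, which is the gap that motivates detectors based on features other than name randomness.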
CR, Dec 7, 2019
Unravelling Ariadne's Thread: Exploring the Threats of Decentalised DNS
Constantinos Patsakis, Fran Casino, Nikolaos Lykousas et al.
The current landscape of the core Internet technologies shows considerable centralisation, with the big tech companies controlling the vast majority of traffic and services. This has sparked a wide range of decentralisation initiatives, with perhaps the most profound and successful being blockchain technology. In recent years, a core Internet infrastructure, the Domain Name System (DNS), has been under revision, mainly due to its inherent security and privacy issues. One of the proposed panaceas is Blockchain-based DNS, which claims to solve many issues of traditional DNS. However, this does not come without security concerns and issues, as any introduction and adoption of a new technology does - let alone a disruptive one such as blockchain. In this work, we discuss a number of associated threats, including emerging ones, and we validate many of them with real-world data. In this regard, we explore a part of the blockchain DNS ecosystem in terms of the browser extensions using such technologies, the chain itself (Namecoin and Emercoin), the domains, and the users which have been registered in both platforms. Finally, we provide some countermeasures to address the identified threats, and we propose a fertile common ground for further research.
CR, Sep 16, 2019
Encrypted and Covert DNS Queries for Botnets: Challenges and Countermeasures
Constantinos Patsakis, Fran Casino, Vasilios Katos
There is a continuous increase in the sophistication that modern malware exercises in order to bypass the deployed security mechanisms. A typical approach to evade the identification and potential takedown of a botnet command and control server is domain fluxing through the use of Domain Generation Algorithms (DGAs). These algorithms produce a vast number of domain names that the infected device tries to communicate with to find the C&C server, yet only a small fraction of them is actually registered. This allows the botmaster to pivot the control and makes the work of seizing the botnet control rather difficult. Current state of the art and practice considers that the DNS queries performed by a compromised device are transparent to the network administrator and therefore can be monitored, analysed, and blocked. In this work, we showcase that the latter is a strong assumption, as malware could efficiently hide its DNS queries using covert and/or encrypted channels, bypassing the detection mechanisms. To this end, we discuss possible mitigation measures based on traffic analysis to address the new challenges that arise f
CR, Jul 16, 2019
Blockchain Mutability: Challenges and Proposed Solutions
Eugenia Politou, Fran Casino, Efthimios Alepis et al.
Blockchain's evolution during the past decade is astonishing: from bitcoin to over 2,000 altcoins, and from decentralised electronic payments to transactions programmable by smart contracts and complex tokens governed by decentralised organisations. While the new generation of blockchain applications is still evolving, blockchain's technical characteristics are also advancing. Yet, immutability, a hitherto indisputable property according to which blockchain data can be neither edited nor deleted, remains the cornerstone of blockchain's security. Nevertheless, blockchain's immutability has lately been called into question in the light of the new erasing requirements imposed by the GDPR's "Right to be Forgotten (RtbF)" provision. As the RtbF obliges blockchain data to be editable so that restricted content redactions, modifications or deletions can be applied when requested, blockchains' compliance with the regulation is indeed challenging, if not impracticable. Towards resolving this contradiction, various methods and techniques for mutable blockchains have been proposed in an effort to satisfy regulatory erasing requirements while preserving blockchains' security. To this end, this work aims to provide a comprehensive review of the state-of-the-art research approaches, technical workarounds and advanced cryptographic techniques that have been put forward to resolve this conflict and to discuss their potentials, constraints and limitations when applied in the wild to either permissioned or permissionless blockchains.
CR, May 28, 2019
Hydras and IPFS: A Decentralised Playground for Malware
Constantinos Patsakis, Fran Casino
Modern malware can take various forms, and has reached a very high level of sophistication in terms of its penetration, persistence, communication and hiding capabilities. The use of cryptography, and of covert communication channels over public and widely used protocols and services, is becoming a norm. In this work, we start by introducing Resource Identifier Generation Algorithms. These are an extension of a well-known mechanism called Domain Generation Algorithms (DGA), which are frequently employed by cybercriminals for bot management and communication. Our extension allows, beyond DNS, the use of other protocols. More concretely, we showcase the exploitation of the InterPlanetary file system (IPFS). This is a solution for the "permanent web", which enjoys a steadily growing community interest and adoption. The IPFS is, in addition, one of the most prominent solutions for blockchain storage. We go beyond the straightforward case of using the IPFS for hosting malicious content, and explore ways in which a botmaster could employ it, to manage her bots, validating our findings experimentally. Finally, we discuss the advantages of our approach for malware authors, its efficacy and highlight its extensibility for other distributed storage services.
CR, May 28, 2019
HEDGE: Efficient Traffic Classification of Encrypted and Compressed Packets
Fran Casino, Kim-Kwang Raymond Choo, Constantinos Patsakis
As the size and source of network traffic increase, so does the challenge of monitoring and analysing network traffic. Therefore, sampling algorithms are often used to alleviate these scalability issues. However, the use of high entropy data streams, through the use of either encryption or compression, further compounds the challenge as current state of the art algorithms cannot accurately and efficiently differentiate between encrypted and compressed packets. In this work, we propose a novel traffic classification method named HEDGE (High Entropy DistinGuishEr) to distinguish between compressed and encrypted traffic. HEDGE is based on the evaluation of the randomness of the data streams and can be applied to individual packets without the need to have access to the entire stream. Findings from the evaluation show that our approach outperforms current state of the art. We also make available our statistically sound dataset, based on known benchmarks, to the wider research community.
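The abstract above states that compressed and encrypted packets are both high-entropy and therefore hard to tell apart. The sketch below illustrates only that difficulty, using plain byte entropy; it is not the HEDGE method, whose randomness tests are not described in the abstract. The vocabulary, the 1024-byte window, and the use of seeded pseudo-random bytes as a stand-in for ciphertext are all assumptions of this illustration.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (maximum 8.0)."""
    total = len(payload)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(payload).values())


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical plaintext: words drawn at random from a small vocabulary.
VOCAB = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot",
         "golf", "hotel", "india", "juliet", "kilo", "lima"]
plaintext = " ".join(rng.choice(VOCAB) for _ in range(20000)).encode()

# Packet-sized windows; 1024 bytes is an arbitrary choice for this sketch.
plain_pkt = plaintext[:1024]
compressed_pkt = zlib.compress(plaintext)[1024:2048]

# Stand-in for ciphertext: uniform pseudo-random bytes, NOT real encryption.
random_pkt = bytes(rng.getrandbits(8) for _ in range(1024))

entropies = {
    "plain": byte_entropy(plain_pkt),
    "compressed": byte_entropy(compressed_pkt),
    "random": byte_entropy(random_pkt),
}
```

In this setup the plaintext window scores far lower than the other two, while the compressed and pseudo-random windows both sit near the 8-bit ceiling, so a single entropy threshold cannot separate them.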
AI, Jun 13, 2017
Technical Report: Implementation and Validation of a Smart Health Application
Fran Casino, Constantinos Patsakis, Antoni Martinez-Balleste et al.
In this article, we explain in detail the internal structures and databases of a smart health application. Moreover, we describe how to generate a statistically sound synthetic dataset using real-world medical data.