71.2CRMay 28
When AI Meets Wall Street: A Survey on Trustworthy AI in FintechQingwen Zeng, Zhenghao Zhao, Yitian Yang et al.
Artificial intelligence is now embedded as a primary decision engine in continuously operated financial AI pipelines spanning training and updating, deployment and inference, and operation with monitoring and feedback. The automation and scale that make these pipelines effective also create novel attack surfaces, where small algorithmic perturbations can amplify into persistent, system-level financial harm. Existing surveys, however, either treat AI as a defensive tool or analyse adversarial machine learning in a domain-agnostic manner, abstracting away finance-specific constraints such as accounting plausibility, non-IID federated data, continuous retraining, and automation-amplified downstream effects. We address this gap with a unified, lifecycle-centric and mechanism-driven framework. We partition financial AI into three lifecycle stages: training and updating, deployment and inference, and operation, monitoring, and feedback. We further propose the Financial AI Security and Robustness Taxonomy, organising seventeen attack subtypes across data and model poisoning, adversarial attacks on decision boundaries, prompt injection in LLM-mediated workflows, and deepfake-driven subversion of KYC verification layers. For each subtype, we analyse algorithmic strategy, feasibility constraints, stealth and persistence, and downstream financial consequences. Finally, we identify open challenges and outline a research agenda toward lifecycle-aware stress testing and finance-relevant robustness benchmarks.
CRMar 21, 2023
Poisoning Attacks in Federated Edge Learning for Digital Twin 6G-enabled IoTs: An Anticipatory StudyMohamed Amine Ferrag, Burak Kantarci, Lucas C. Cordeiro et al.
Federated edge learning can be essential in supporting privacy-preserving, artificial intelligence (AI)-enabled activities in digital twin 6G-enabled Internet of Things (IoT) environments. However, we need to also consider the potential of attacks targeting the underlying AI systems (e.g., adversaries seek to corrupt data on the IoT devices during local updates or corrupt the model updates); hence, in this article, we propose an anticipatory study for poisoning attacks in federated edge learning for digital twin 6G-enabled IoT environments. Specifically, we study the influence of adversaries on the training and development of federated learning models in digital twin 6G-enabled IoT environments. We demonstrate that attackers can carry out poisoning attacks in two different learning settings, namely: centralized learning and federated learning, and successful attacks can severely reduce the model's accuracy. We comprehensively evaluate the attacks on a new cyber security dataset designed for IoT applications with three deep neural networks under the non-independent and identically distributed (Non-IID) data and the independent and identically distributed (IID) data. The poisoning attacks, on an attack classification problem, can lead to a decrease in accuracy from 94.93% to 85.98% with IID data and from 94.18% to 30.04% with Non-IID.
CLMar 22, 2023
Towards Understanding the Generalization of Medical Text-to-SQL Models and DatasetsRichard Tarbell, Kim-Kwang Raymond Choo, Glenn Dietrich et al.
Electronic medical records (EMRs) are stored in relational databases. It can be challenging to access the required information if the user is unfamiliar with the database schema or general database fundamentals. Hence, researchers have explored text-to-SQL generation methods that provide healthcare professionals direct access to EMR data without needing a database expert. However, currently available datasets have been essentially "solved" with state-of-the-art models achieving accuracy greater than or near 90%. In this paper, we show that there is still a long way to go before solving text-to-SQL generation in the medical domain. To show this, we create new splits of the existing medical text-to-SQL dataset MIMICSQL that better measure the generalizability of the resulting models. We evaluate state-of-the-art language models on our new split showing substantial drops in performance with accuracy dropping from up to 92% to 28%, thus showing substantial room for improvement. Moreover, we introduce a novel data augmentation approach to improve the generalizability of the language models. Overall, this paper is the first step towards developing more robust text-to-SQL models in the medical domain.\footnote{The dataset and code will be released upon acceptance.
CRApr 10, 2022
BABD: A Bitcoin Address Behavior Dataset for Pattern AnalysisYuexin Xiang, Yuchen Lei, Ding Bao et al.
Cryptocurrencies are no longer just the preferred option for cybercriminal activities on darknets, due to the increasing adoption in mainstream applications. This is partly due to the transparency associated with the underpinning ledgers, where any individual can access the record of a transaction record on the public ledger. In this paper, we build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled data, which is the largest labeled Bitcoin address behavior dataset publicly available to our knowledge. We then use our proposed dataset on common machine learning models, namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost. The results show that the accuracy rates of these machine learning models for the multi-classification task on our proposed dataset are between 93.24% and 97.13%. We also analyze the proposed features and their relationships from the experiments, and propose a k-hop subgraph generation algorithm to extract a k-hop subgraph from the entire Bitcoin transaction graph constructed by the directed heterogeneous multigraph starting from a specific Bitcoin address node (e.g., a known transaction associated with a criminal investigation). Besides, we initially analyze the behavior patterns of different types of Bitcoin addresses according to the extracted features.
LGOct 20, 2023
EASTER: Embedding Aggregation-based Heterogeneous Models Training in Vertical Federated LearningShuo Wang, Keke Gai, Jing Yu et al.
Vertical federated learning has garnered significant attention as it allows clients to train machine learning models collaboratively without sharing local data, which protects the client's local private data. However, existing VFL methods face challenges when dealing with heterogeneous local models among participants, which affects optimization convergence and generalization. To address this challenge, this paper proposes a novel approach called Vertical federated learning for training multiple Heterogeneous models (VFedMH). VFedMH focuses on aggregating the local embeddings of each participant's knowledge during forward propagation. To protect the participants' local embedding values, we propose an embedding protection method based on lightweight blinding factors. In particular, participants obtain local embedding using local heterogeneous models. Then the passive party, who owns only features of the sample, injects the blinding factor into the local embedding and sends it to the active party. The active party aggregates local embeddings to obtain global knowledge embeddings and sends them to passive parties. The passive parties then utilize the global embeddings to propagate forward on their local heterogeneous networks. However, the passive party does not own the sample labels, so the local model gradient cannot be calculated locally. To overcome this limitation, the active party assists the passive party in computing its local heterogeneous model gradients. Then, each participant trains their local model using the heterogeneous model gradients. The objective is to minimize the loss value of their respective local heterogeneous models. Extensive experiments are conducted to demonstrate that VFedMH can simultaneously train multiple heterogeneous models with heterogeneous optimization and outperform some recent methods in model performance.
CVDec 21, 2023Code
MFABA: A More Faithful and Accelerated Boundary-based Attribution Method for Deep Neural NetworksZhiyu Zhu, Huaming Chen, Jiayu Zhang et al.
To better understand the output of deep neural networks (DNN), attribution based methods have been an important approach for model interpretability, which assign a score for each input dimension to indicate its importance towards the model outcome. Notably, the attribution methods use the axioms of sensitivity and implementation invariance to ensure the validity and reliability of attribution results. Yet, the existing attribution methods present challenges for effective interpretation and efficient computation. In this work, we introduce MFABA, an attribution algorithm that adheres to axioms, as a novel method for interpreting DNN. Additionally, we provide the theoretical proof and in-depth analysis for MFABA algorithm, and conduct a large scale experiment. The results demonstrate its superiority by achieving over 101.5142 times faster speed than the state-of-the-art attribution algorithms. The effectiveness of MFABA is thoroughly evaluated through the statistical analysis in comparison to other methods, and the full implementation package is open-source at: https://github.com/LMBTough/MFABA
CVJan 11, 2024Code
GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative modelZhiyu Zhu, Huaming Chen, Xinyi Wang et al.
Adversarial generative models, such as Generative Adversarial Networks (GANs), are widely applied for generating various types of data, i.e., images, text, and audio. Accordingly, its promising performance has led to the GAN-based adversarial attack methods in the white-box and black-box attack scenarios. The importance of transferable black-box attacks lies in their ability to be effective across different models and settings, more closely aligning with real-world applications. However, it remains challenging to retain the performance in terms of transferable adversarial examples for such methods. Meanwhile, we observe that some enhanced gradient-based transferable adversarial attack algorithms require prolonged time for adversarial sample generation. Thus, in this work, we propose a novel algorithm named GE-AdvGAN to enhance the transferability of adversarial samples whilst improving the algorithm's efficiency. The main approach is via optimising the training process of the generator parameters. With the functional and characteristic similarity analysis, we introduce a novel gradient editing (GE) mechanism and verify its feasibility in generating transferable samples on various models. Moreover, by exploring the frequency domain information to determine the gradient editing direction, GE-AdvGAN can generate highly transferable adversarial samples while minimizing the execution time in comparison to the state-of-the-art transferable adversarial attack algorithms. The performance of GE-AdvGAN is comprehensively evaluated by large-scale experiments on different datasets, which results demonstrate the superiority of our algorithm. The code for our algorithm is available at: https://github.com/LMBTough/GE-advGAN
LGDec 8, 2024Code
DapperFL: Domain Adaptive Federated Learning with Model Fusion Pruning for Edge DevicesYongzhe Jia, Xuyun Zhang, Hongsheng Hu et al.
Federated learning (FL) has emerged as a prominent machine learning paradigm in edge computing environments, enabling edge devices to collaboratively optimize a global model without sharing their private data. However, existing FL frameworks suffer from efficacy deterioration due to the system heterogeneity inherent in edge computing, especially in the presence of domain shifts across local data. In this paper, we propose a heterogeneous FL framework DapperFL, to enhance model performance across multiple domains. In DapperFL, we introduce a dedicated Model Fusion Pruning (MFP) module to produce personalized compact local models for clients to address the system heterogeneity challenges. The MFP module prunes local models with fused knowledge obtained from both local and remaining domains, ensuring robustness to domain shifts. Additionally, we design a Domain Adaptive Regularization (DAR) module to further improve the overall performance of DapperFL. The DAR module employs regularization generated by the pruned model, aiming to learn robust representations across domains. Furthermore, we introduce a specific aggregation algorithm for aggregating heterogeneous local models with tailored architectures and weights. We implement DapperFL on a realworld FL platform with heterogeneous clients. Experimental results on benchmark datasets with multiple domains demonstrate that DapperFL outperforms several state-of-the-art FL frameworks by up to 2.28%, while significantly achieving model volume reductions ranging from 20% to 80%. Our code is available at: https://github.com/jyzgh/DapperFL.
CVAug 14, 2020Code
Generating Image Adversarial Examples by Embedding Digital WatermarksYuexin Xiang, Tiantian Li, Wei Ren et al.
With the increasing attention to deep neural network (DNN) models, attacks are also upcoming for such models. For example, an attacker may carefully construct images in specific ways (also referred to as adversarial examples) aiming to mislead the DNN models to output incorrect classification results. Similarly, many efforts are proposed to detect and mitigate adversarial examples, usually for certain dedicated attacks. In this paper, we propose a novel digital watermark-based method to generate image adversarial examples to fool DNN models. Specifically, partial main features of the watermark image are embedded into the host image almost invisibly, aiming to tamper with and damage the recognition capabilities of the DNN models. We devise an efficient mechanism to select host images and watermark images and utilize the improved discrete wavelet transform (DWT) based Patchwork watermarking algorithm with a set of valid hyperparameters to embed digital watermarks from the watermark image dataset into original images for generating image adversarial examples. The experimental results illustrate that the attack success rate on common DNN models can reach an average of 95.47% on the CIFAR-10 dataset and the highest at 98.71%. Besides, our scheme is able to generate a large number of adversarial examples efficiently, concretely, an average of 1.17 seconds for completing the attacks on each image on the CIFAR-10 dataset. In addition, we design a baseline experiment using the watermark images generated by Gaussian noise as the watermark image dataset that also displays the effectiveness of our scheme. Similarly, we also propose the modified discrete cosine transform (DCT) based Patchwork watermarking algorithm. To ensure repeatability and reproducibility, the source code is available on GitHub.
LGDec 18, 2025
Feature-Selective Representation Misdirection for Machine UnlearningTaozhao Chen, Linghan Huang, Kim-Kwang Raymond Choo et al.
As large language models (LLMs) are increasingly adopted in safety-critical and regulated sectors, the retention of sensitive or prohibited knowledge introduces escalating risks, ranging from privacy leakage to regulatory non-compliance to to potential misuse, and so on. Recent studies suggest that machine unlearning can help ensure deployed models comply with evolving legal, safety, and governance requirements. However, current unlearning techniques assume clean separation between forget and retain datasets, which is challenging in operational settings characterized by highly entangled distributions. In such scenarios, perturbation-based methods often degrade general model utility or fail to ensure safety. To address this, we propose Selective Representation Misdirection for Unlearning (SRMU), a novel principled activation-editing framework that enforces feature-aware and directionally controlled perturbations. Unlike indiscriminate model weights perturbations, SRMU employs a structured misdirection vector with an activation importance map. The goal is to allow SRMU selectively suppresses harmful representations while preserving the utility on benign ones. Experiments are conducted on the widely used WMDP benchmark across low- and high-entanglement configurations. Empirical results reveal that SRMU delivers state-of-the-art unlearning performance with minimal utility losses, and remains effective under 20-30\% overlap where existing baselines collapse. SRMU provides a robust foundation for safety-driven model governance, privacy compliance, and controlled knowledge removal in the emerging LLM-based applications. We release the replication package at https://figshare.com/s/d5931192a8824de26aff.
44.3AIApr 26
Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP PipelinesMazal Bethany, Kim-Kwang Raymond Choo, Nishant Vishwamitra et al.
Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions: binary-only feedback, no gradient access, and strict query budgets. We formalize this strict black-box threat model and propose a two-agent evasion framework operating in a semantic perturbation space. An Attacker Agent generates meaning-preserving rewrites while a Prompt Optimization Agent refines the attack strategy using only binary decision feedback within a 10-query budget. Evaluated against four evidence-based misinformation detection pipelines, the framework achieves evasion rates of 19.95 to 40.34% on modern large language model (LLM) based systems, compared to at most 3.90% for token-level perturbation baselines that rely on surrogate models because they cannot operate under our threat model. A legacy system relying on static lexical retrieval exhibits near-total vulnerability 97.02%, establishing a lower bound that exposes how architectural choices govern the attack surface. Evasion effectiveness is associated with three architectural properties: evidence retrieval mechanism, retrieval-inference coupling, and baseline classification accuracy. The iterative prompt optimization yields the largest marginal gains against the most robust targets, confirming that adaptive strategy discovery is essential when evasion is non-trivial. Analysis of successful rewrites reveals four exploitation patterns, each targeting failures at distinct pipeline stages. A pattern-informed defense reduces the evasion rate by up to 65.18%.
LGDec 27, 2023
FairCompass: Operationalising Fairness in Machine LearningJessica Liu, Huaming Chen, Jun Shen et al.
As artificial intelligence (AI) increasingly becomes an integral part of our societal and individual activities, there is a growing imperative to develop responsible AI solutions. Despite a diverse assortment of machine learning fairness solutions is proposed in the literature, there is reportedly a lack of practical implementation of these tools in real-world applications. Industry experts have participated in thorough discussions on the challenges associated with operationalising fairness in the development of machine learning-empowered solutions, in which a shift toward human-centred approaches is promptly advocated to mitigate the limitations of existing techniques. In this work, we propose a human-in-the-loop approach for fairness auditing, presenting a mixed visual analytical system (hereafter referred to as 'FairCompass'), which integrates both subgroup discovery technique and the decision tree-based schema for end users. Moreover, we innovatively integrate an Exploration, Guidance and Informed Analysis loop, to facilitate the use of the Knowledge Generation Model for Visual Analytics in FairCompass. We evaluate the effectiveness of FairCompass for fairness auditing in a real-world scenario, and the findings demonstrate the system's potential for real-world deployability. We anticipate this work will address the current gaps in research for fairness and facilitate the operationalisation of fairness in machine learning systems.
CRFeb 2, 2025
`Do as I say not as I do': A Semi-Automated Approach for Jailbreak Prompt Attack against Multimodal LLMsChun Wai Chiu, Linghan Huang, Bo Li et al.
Large Language Models (LLMs) have seen widespread applications across various domains due to their growing ability to process diverse types of input data, including text, audio, image and video. While LLMs have demonstrated outstanding performance in understanding and generating contexts for different scenarios, they are vulnerable to prompt-based attacks, which are mostly via text input. In this paper, we introduce the first voice-based jailbreak attack against multimodal LLMs, termed as Flanking Attack, which can process different types of input simultaneously towards the multimodal LLMs. Our work is motivated by recent advancements in monolingual voice-driven large language models, which have introduced new attack surfaces beyond traditional text-based vulnerabilities for LLMs. To investigate these risks, we examine the state-of-the-art multimodal LLMs, which can be accessed via different types of inputs such as audio input, focusing on how adversarial prompts can bypass its defense mechanisms. We propose a novel strategy, in which the disallowed prompt is flanked by benign, narrative-driven prompts. It is integrated in the Flanking Attack which attempts to humanizes the interaction context and execute the attack through a fictional setting. Further, to better evaluate the attack performance, we present a semi-automated self-assessment framework for policy violation detection. We demonstrate that Flanking Attack is capable of manipulating state-of-the-art LLMs into generating misaligned and forbidden outputs, which achieves an average attack success rate ranging from 0.67 to 0.93 across seven forbidden scenarios.
CRJan 30, 2025
Large Language Models for Cryptocurrency Transaction Analysis: A Bitcoin Case StudyYuchen Lei, Yuexin Xiang, Qin Wang et al.
Cryptocurrencies are widely used, yet current methods for analyzing transactions often rely on opaque, black-box models. While these models may achieve high performance, their outputs are usually difficult to interpret and adapt, making it challenging to capture nuanced behavioral patterns. Large language models (LLMs) have the potential to address these gaps, but their capabilities in this area remain largely unexplored, particularly in cybercrime detection. In this paper, we test this hypothesis by applying LLMs to real-world cryptocurrency transaction graphs, with a focus on Bitcoin, one of the most studied and widely adopted blockchain networks. We introduce a three-tiered framework to assess LLM capabilities: foundational metrics, characteristic overview, and contextual interpretation. This includes a new, human-readable graph representation format, LLM4TG, and a connectivity-enhanced transaction graph sampling algorithm, CETraS. Together, they significantly reduce token requirements, transforming the analysis of multiple moderately large-scale transaction graphs with LLMs from nearly impossible to feasible under strict token limits. Experimental results demonstrate that LLMs have outstanding performance on foundational metrics and characteristic overview, where the accuracy of recognizing most basic information at the node level exceeds 98.50% and the proportion of obtaining meaningful characteristics reaches 95.00%. Regarding contextual interpretation, LLMs also demonstrate strong performance in classification tasks, even with very limited labeled data, where top-3 accuracy reaches 72.43% with explanations. While the explanations are not always fully accurate, they highlight the strong potential of LLMs in this domain. At the same time, several limitations persist, which we discuss along with directions for future research.
CLJun 25, 2024
Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT ThreatsRyan Pavlich, Nima Ebadi, Richard Tarbell et al.
Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. While substantial progress has been made in this field, existing research has concentrated on generating SQL statements from text queries. The broader challenge, however, lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset contains additional query types limited in prior text-to-SQL datasets, notably temporal-related queries. Our dataset is sourced from a smart building's IoT ecosystem exploring sensor read and network traffic data. Second, our dataset allows two-stage processing, where the returned data (network traffic) from a generated SQL can be categorized as malicious or not. Our results show that joint training to query and infer information about the data can improve overall text-to-SQL performance, nearly matching substantially larger models. We also show that current large language models (e.g., GPT3.5) struggle to infer new information about returned data, thus our dataset provides a novel test bed for integrating complex domain-specific reasoning into LLMs.
CRMay 7, 2024
A2-DIDM: Privacy-preserving Accumulator-enabled Auditing for Distributed Identity of DNN ModelTianxiu Xie, Keke Gai, Jing Yu et al.
Recent booming development of Generative Artificial Intelligence (GenAI) has facilitated an emerging model commercialization for the purpose of reinforcement on model performance, such as licensing or trading Deep Neural Network (DNN) models. However, DNN model trading may trigger concerns of the unauthorized replications or misuses over the model, so that the benefit of the model ownership will be violated. Model identity auditing is a challenging issue in protecting intellectual property of DNN models and verifying the integrity and ownership of models for guaranteeing trusts in transactions is one of the critical obstacles. In this paper, we focus on the above issue and propose a novel Accumulator-enabled Auditing for Distributed Identity of DNN Model (A2-DIDM) that utilizes blockchain and zero-knowledge techniques to protect data and function privacy while ensuring the lightweight on-chain ownership verification. The proposed model presents a scheme of identity records via configuring model weight checkpoints with corresponding zero-knowledge proofs, which incorporates predicates to capture incremental state changes in model weight checkpoints. Our scheme ensures both computational integrity of DNN training process and programmability, so that the uniqueness of the weight checkpoint sequence in a DNN model is preserved, ensuring the correctness of the model identity auditing. In addition, A2-DIDM also addresses privacy protections in distributed identity via a proposed method of accumulators. We systematically analyze the security and robustness of our proposed model and further evaluate the effectiveness and usability of auditing DNN model identities.
LGMay 3, 2024
Holistic Evaluation Metrics: Use Case Sensitive Evaluation Metrics for Federated LearningYanli Li, Jehad Ibrahim, Huaming Chen et al.
A large number of federated learning (FL) algorithms have been proposed for different applications and from varying perspectives. However, the evaluation of such approaches often relies on a single metric (e.g., accuracy). Such a practice fails to account for the unique demands and diverse requirements of different use cases. Thus, how to comprehensively evaluate an FL algorithm and determine the most suitable candidate for a designated use case remains an open question. To mitigate this research gap, we introduce the Holistic Evaluation Metrics (HEM) for FL in this work. Specifically, we collectively focus on three primary use cases, which are Internet of Things (IoT), smart devices, and institutions. The evaluation metric encompasses various aspects including accuracy, convergence, computational efficiency, fairness, and personalization. We then assign a respective importance vector for each use case, reflecting their distinct performance requirements and priorities. The HEM index is finally generated by integrating these metric components with their respective importance vectors. Through evaluating different FL algorithms in these three prevalent use cases, our experimental results demonstrate that HEM can effectively assess and identify the FL algorithms best suited to particular scenarios. We anticipate this work sheds light on the evaluation process for pragmatic FL algorithms in real-world applications.
NIJun 27, 2021
A Systematic Review of Bio-Cyber Interface Technologies and Security Issues for Internet of Bio-Nano ThingsSidra Zafar, Mohsin Nazir, Taimur Bakhshi et al.
Advances in synthetic biology and nanotechnology have contributed to the design of tools that can be used to control, reuse, modify, and re-engineer cells' structure, as well as enabling engineers to effectively use biological cells as programmable substrates to realize Bio-Nano Things (biological embedded computing devices). Bio-NanoThings are generally tiny, non-intrusive, and concealable devices that can be used for in-vivo applications such as intra-body sensing and actuation networks, where the use of artificial devices can be detrimental. Such (nano-scale) devices can be used in various healthcare settings such as continuous health monitoring, targeted drug delivery, and nano-surgeries. These services can also be grouped to form a collaborative network (i.e., nanonetwork), whose performance can potentially be improved when connected to higher bandwidth external networks such as the Internet, say via 5G. However, to realize the IoBNT paradigm, it is also important to seamlessly connect the biological environment with the technological landscape by having a dynamic interface design to convert biochemical signals from the human body into an equivalent electromagnetic signal (and vice versa). This, unfortunately, risks the exposure of internal biological mechanisms to cyber-based sensing and medical actuation, with potential security and privacy implications. This paper comprehensively reviews bio-cyber interface for IoBNT architecture, focusing on bio-cyber interfacing options for IoBNT like biologically inspired bio-electronic devices, RFID enabled implantable chips, and electronic tattoos. This study also identifies known and potential security and privacy vulnerabilities and mitigation strategies for consideration in future IoBNT designs and implementations.
CVMay 19, 2021
A Lightweight Privacy-Preserving Scheme Using Label-based Pixel Block Mixing for Image Classification in Deep LearningYuexin Xiang, Tiantian Li, Wei Ren et al.
To ensure the privacy of sensitive data used in the training of deep learning models, a number of privacy-preserving methods have been designed by the research community. However, existing schemes are generally designed to work with textual data, or are not efficient when a large number of images is used for training. Hence, in this paper we propose a lightweight and efficient approach to preserve image privacy while maintaining the availability of the training set. Specifically, we design the pixel block mixing algorithm for image classification privacy preservation in deep learning. To evaluate its utility, we use the mixed training set to train the ResNet50, VGG16, InceptionV3 and DenseNet121 models on the WIKI dataset and the CNBC face dataset. Experimental findings on the testing set show that our scheme preserves image privacy while maintaining the availability of the training set in the deep learning models. Additionally, the experimental results demonstrate that we achieve good performance for the VGG16 model on the WIKI dataset and both ResNet50 and DenseNet121 on the CNBC dataset. The pixel block algorithm achieves fairly high efficiency in the mixing of the images, and it is computationally challenging for the attackers to restore the mixed training set to the original training set. Moreover, data augmentation can be applied to the mixed training set to improve the training's effectiveness.
CYMay 16, 2021
Investigating Protected Health Information Leakage from Android Medical ApplicationsGeorge Grispos, Talon Flynn, William Glisson et al.
As smartphones and smartphone applications are widely used in a healthcare context (e.g., remote healthcare), these devices and applications may need to comply with the Health Insurance Portability and Accountability Act (HIPAA) of 1996. In other words, adequate safeguards to protect the user's sensitive information (e.g., personally identifiable information and/or medical history) are required to be enforced on such devices and applications. In this study, we forensically focus on the potential of recovering residual data from Android medical applications, with the objective of providing an initial risk assessment of such applications. Our findings (e.g., documentation of the artifacts) also contribute to a better understanding of the types and location of evidential artifacts that can, potentially, be recovered from these applications in a digital forensic investigation.
CRMay 14, 2021
Consumer, Commercial and Industrial IoT (In)Security: Attack Taxonomy and Case StudiesChristos Xenofontos, Ioannis Zografopoulos, Charalambos Konstantinou et al.
Internet of Things (IoT) devices are becoming ubiquitous in our lives, with applications spanning from the consumer domain to commercial and industrial systems. The steep growth and vast adoption of IoT devices reinforce the importance of sound and robust cybersecurity practices during the device development life-cycles. IoT-related vulnerabilities, if successfully exploited can affect, not only the device itself, but also the application field in which the IoT device operates. Evidently, identifying and addressing every single vulnerability is an arduous, if not impossible, task. Attack taxonomies can assist in classifying attacks and their corresponding vulnerabilities. Security countermeasures and best practices can then be leveraged to mitigate threats and vulnerabilities before they emerge into catastrophic attacks and ensure overall secure IoT operation. Therefore, in this paper, we provide an attack taxonomy which takes into consideration the different layers of IoT stack, i.e., device, infrastructure, communication, and service, and each layer's designated characteristics which can be exploited by adversaries. Furthermore, using nine real-world cybersecurity incidents, that had targeted IoT devices deployed in the consumer, commercial, and industrial sectors, we describe the IoT-related vulnerabilities, exploitation procedures, attacks, impacts, and potential mitigation mechanisms and protection strategies. These (and many other) incidents highlight the underlying security concerns of IoT systems and demonstrate the potential attack impacts of such connected ecosystems, while the proposed taxonomy provides a systematic procedure to categorize attacks based on the affected layer and corresponding impact.
CRDec 21, 2020
DeepKeyGen: A Deep Learning-based Stream Cipher Generator for Medical Image Encryption and DecryptionYi Ding, Fuyuan Tan, Zhen Qin et al.
The need for medical image encryption is increasingly pronounced, for example to safeguard the privacy of the patients' medical imaging data. In this paper, a novel deep learning-based key generation network (DeepKeyGen) is proposed as a stream cipher generator to generate the private key, which can then be used for encrypting and decrypting of medical images. In DeepKeyGen, the generative adversarial network (GAN) is adopted as the learning network to generate the private key. Furthermore, the transformation domain (that represents the "style" of the private key to be generated) is designed to guide the learning network to realize the private key generation process. The goal of DeepKeyGen is to learn the mapping relationship of how to transfer the initial image to the private key. We evaluate DeepKeyGen using three datasets, namely: the Montgomery County chest X-ray dataset, the Ultrasonic Brachial Plexus dataset, and the BraTS18 dataset. The evaluation findings and security analysis show that the proposed key generation network can achieve a high-level security in generating the private key.
CROct 19, 2020
A Survey of Machine Learning Techniques in Adversarial Image ForensicsEhsan Nowroozi, Ali Dehghantanha, Reza M. Parizi et al.
Image forensic plays a crucial role in both criminal investigations (e.g., dissemination of fake images to spread racial hate or false narratives about specific ethnicity groups) and civil litigation (e.g., defamation). Increasingly, machine learning approaches are also utilized in image forensics. However, there are also a number of limitations and vulnerabilities associated with machine learning-based approaches, for example how to detect adversarial (image) examples, with real-world consequences (e.g., inadmissible evidence, or wrongful conviction). Therefore, with a focus on image forensics, this paper surveys techniques that can be used to enhance the robustness of machine learning-based binary manipulation detectors in various adversarial scenarios.
CRSep 23, 2020
Pocket Diagnosis: Secure Federated Learning against Poisoning Attack in the CloudZhuoran Ma, Jianfeng Ma, Yinbin Miao et al.
Federated learning has become prevalent in medical diagnosis due to its effectiveness in training a federated model among multiple health institutions (i.e. Data Islands (DIs)). However, increasingly massive DI-level poisoning attacks have shed light on a vulnerability in federated learning, which inject poisoned data into certain DIs to corrupt the availability of the federated model. Previous works on federated learning have been inadequate in ensuring the privacy of DIs and the availability of the final federated model. In this paper, we design a secure federated learning mechanism with multiple keys to prevent DI-level poisoning attacks for medical diagnosis, called SFPA. Concretely, SFPA provides privacy-preserving random forest-based federated learning by using the multi-key secure computation, which guarantees the confidentiality of DI-related information. Meanwhile, a secure defense strategy over encrypted locally-submitted models is proposed to defense DI-level poisoning attacks. Finally, our formal security analysis and empirical tests on a public cloud platform demonstrate the security and efficiency of SFPA as well as its capability of resisting DI-level poisoning attacks.
CRMay 18, 2020
VerifyTL: Secure and Verifiable Collaborative Transfer LearningZhuoran Ma, Jianfeng Ma, Yinbin Miao et al.
Getting access to labelled datasets in certain sensitive application domains can be challenging. Hence, one often resorts to transfer learning to transfer knowledge learned from a source domain with sufficient labelled data to a target domain with limited labelled data. However, most existing transfer learning techniques only focus on one-way transfer which brings no benefit to the source domain. In addition, there is the risk of a covert adversary corrupting a number of domains, which can consequently result in inaccurate prediction or privacy leakage. In this paper we construct a secure and Verifiable collaborative Transfer Learning scheme, VerifyTL, to support two-way transfer learning over potentially untrusted datasets by improving knowledge transfer from a target domain to a source domain. Further, we equip VerifyTL with a cross transfer unit and a weave transfer unit employing SPDZ computation to provide privacy guarantee and verification in the two-domain setting and the multi-domain setting, respectively. Thus, VerifyTL is secure against covert adversary that can compromise up to n-1 out of n data domains. We analyze the security of VerifyTL and evaluate its performance over two real-world datasets. Experimental results show that VerifyTL achieves significant performance gains over existing secure learning schemes.
CRJun 12, 2019
Integrating Privacy Enhancing Techniques into Blockchains Using SidechainsReza M. Parizi, Sajad Homayoun, Abbas Yazdinejad et al.
Blockchains are turning into decentralized computing platforms and are getting worldwide recognition for their unique advantages. There is an emerging trend beyond payments that blockchains could enable a new breed of decentralized applications, and serve as the foundation for Internet's security infrastructure. The immutable nature of the blockchain makes it a winner on security and transparency; it is nearly inconceivable for ledgers to be altered in a way not instantly clear to every single user involved. However, most blockchains fall short in privacy aspects, particularly in data protection. Garlic Routing and Onion Routing are two of major Privacy Enhancing Techniques (PETs) which are popular for anonymization and security. Garlic Routing is a methodology using by I2P Anonymous Network to hide the identity of sender and receiver of data packets by bundling multiple messages into a layered encryption structure. The Onion Routing attempts to provide lowlatency Internet-based connections that resist traffic analysis, deanonymization attack, eavesdropping, and other attacks both by outsiders (e.g. Internet routers) and insiders (Onion Routing servers themselves). As there are a few controversies over the rate of resistance of these two techniques to privacy attacks, we propose a PET-Enabled Sidechain (PETES) as a new privacy enhancing technique by integrating Garlic Routing and Onion Routing into a Garlic Onion Routing (GOR) framework suitable to the structure of blockchains. The preliminary proposed GOR aims to improve the privacy of transactions in blockchains via PETES structure.
CRJun 12, 2019
A Blockchain-based Framework for Detecting Malicious Mobile Applications in App StoresSajad Homayoun, Ali Dehghantanha, Reza M. Parizi et al.
The dramatic growth in smartphone malware shows that malicious program developers are shifting from traditional PC systems to smartphone devices. Therefore, security researchers are also moving towards proposing novel antimalware methods to provide adequate protection. This paper proposes a Blockchain-Based Malware Detection Framework (B2MDF) for detecting malicious mobile applications in mobile applications marketplaces (app stores). The framework consists of two internal and external private blockchains forming a dual private blockchain as well as a consortium blockchain for the final decision. The internal private blockchain stores feature blocks extracted by both static and dynamic feature extractors, while the external blockchain stores detection results as blocks for current versions of applications. B2MDF also shares feature blocks with third parties, and this helps antimalware vendors to provide more accurate solutions.
CRMay 28, 2019
HEDGE: Efficient Traffic Classification of Encrypted and Compressed PacketsFran Casino, Kim-Kwang Raymond Choo, Constantinos Patsakis
As the size and source of network traffic increase, so does the challenge of monitoring and analysing network traffic. Therefore, sampling algorithms are often used to alleviate these scalability issues. However, the use of high entropy data streams, through the use of either encryption or compression, further compounds the challenge as current state of the art algorithms cannot accurately and efficiently differentiate between encrypted and compressed packets. In this work, we propose a novel traffic classification method named HEDGE (High Entropy DistinGuishEr) to distinguish between compressed and encrypted traffic. HEDGE is based on the evaluation of the randomness of the data streams and can be applied to individual packets without the need to have access to the entire stream. Findings from the evaluation show that our approach outperforms current state of the art. We also make available our statistically sound dataset, based on known benchmarks, to the wider research community.
CRSep 7, 2018
Empirical Vulnerability Analysis of Automated Smart Contracts Security Testing on BlockchainsReza M. Parizi, Ali Dehghantanha, Kim-Kwang Raymond Choo et al.
The emerging blockchain technology supports decentralized computing paradigm shift and is a rapidly approaching phenomenon. While blockchain is thought primarily as the basis of Bitcoin, its application has grown far beyond cryptocurrencies due to the introduction of smart contracts. Smart contracts are self-enforcing pieces of software, which reside and run over a hosting blockchain. Using blockchain-based smart contracts for secure and transparent management to govern interactions (authentication, connection, and transaction) in Internet-enabled environments, mostly IoT, is a niche area of research and practice. However, writing trustworthy and safe smart contracts can be tremendously challenging because of the complicated semantics of underlying domain-specific languages and its testability. There have been high-profile incidents that indicate blockchain smart contracts could contain various code-security vulnerabilities, instigating financial harms. When it involves security of smart contracts, developers embracing the ability to write the contracts should be capable of testing their code, for diagnosing security vulnerabilities, before deploying them to the immutable environments on blockchains. However, there are only a handful of security testing tools for smart contracts. This implies that the existing research on automatic smart contracts security testing is not adequate and remains in a very stage of infancy. With a specific goal to more readily realize the application of blockchain smart contracts in security and privacy, we should first understand their vulnerabilities before widespread implementation. Accordingly, the goal of this paper is to carry out a far-reaching experimental assessment of current static smart contracts security testing tools, for the most widely used blockchain, the Ethereum and its domain-specific programming language, Solidity to provide the first...
CYAug 6, 2018
Digital Blues: An Investigation into the Use of Bluetooth ProtocolsWilliam Ledbetter, William Bradley Glisson, Todd McDonald et al.
The proliferation of Bluetooth mobile device communications into all aspects of modern society raises security questions by both academicians and practitioners. This environment prompted an investigation into the real-world use of Bluetooth protocols along with an analysis of documented security attacks. The experiment discussed in this paper collected data for one week in a local coffee shop. The data collection took about an hour each day and identified 478 distinct devices. The contribution of this research is two-fold. First, it provides insight into real-world Bluetooth protocols that are being utilized by the general public. Second, it provides foundational research that is necessary for future Bluetooth penetration testing research.
CRAug 3, 2018
Non-Reciprocity Compensation Combined with Turbo Codes for Secret Key Generation in Vehicular Ad Hoc Social IoT NetworksGregory Epiphaniou, Petros Karadimas, Dhouha Kbaier Ben Ismail et al.
The physical attributes of the dynamic vehicle-to-vehicle (V2V) propagation channel can be utilised for the generation of highly random and symmetric cryptographic keys. However, in a physical-layer key agreement scheme, non-reciprocity due to inherent channel noise and hardware impairments can propagate bit disagreements. This has to be addressed prior to the symmetric key generation which is inherently important in social Internet of Things (IoT) networks, including in adversarial settings (e.g. battlefields). In this paper, we parametrically incorporate temporal variability attributes, such as three-dimensional (3D) scattering and scatterers mobility. Accordingly, this is the first work to incorporate such features into the key generation process by combining non-reciprocity compensation with turbo codes. Preliminary results indicate a significant improvement when using Turbo Codes in bit mismatch rate (BMR) and key generation rate (KGR) in comparison to sample indexing techniques.
CRJul 27, 2018
Ubuntu One Investigation: Detecting Evidences on Client MachinesMohammad Shariati, Ali Dehghantanha1, Ben Martini et al.
STorage as a Service (STaaS) cloud services has been adopted by both individuals and businesses as a dominant technology worldwide. Similar to other technologies, this widely accepted service can be misused by criminals. Investigating cloud platforms is becoming a standard component of contemporary digital investigation cases. Hence, digital forensic investigators need to have a working knowledge of the potential evidence that might be stored on cloud services. In this chapter, we conducted a number of experiments to locate data remnants of users' activities when utilizing the Ubuntu One cloud service. We undertook experiments based on common activities performed by users on cloud platforms including downloading, uploading, viewing, and deleting files. We then examined the resulting digital artifacts on a range of client devices, namely, Windows 8.1, Apple Mac OS X, and Apple iOS. Our examination extracted a variety of potentially evidential items ranging from Ubuntu One databases and log files on persistent storage to remnants of user activities in device memory and network traffic.
CRJul 27, 2018
A Cyber Kill Chain Based Taxonomy of Banking Trojans for Evolutionary Computational IntelligenceDennis Kiwia, Ali Dehghantanha, Kim-Kwang Raymond Choo et al.
Malware such as banking Trojans are popular with financially-motivated cybercriminals. Detection of banking Trojans remains a challenging task, due to the constant evolution of techniques used to obfuscate and circumvent existing detection and security solutions. Having a malware taxonomy can facilitate the design of mitigation strategies such as those based on evolutionary computational intelligence. Specifically, in this paper, we propose a cyber kill chain based taxonomy of banking Trojans features. This threat intelligence based taxonomy providing a stage-by-stage operational understanding of a cyber-attack, can be highly beneficial to security practitioners and the design of evolutionary computational intelligence on Trojans detection and mitigation strategy. The proposed taxonomy is validated by using a real-world dataset of 127 banking Trojans collected from December 2014 to January 2016 by a major UK-based financial organisation.
CRJul 27, 2018
Greening Cloud-Enabled Big Data Storage Forensics: Syncany as a Case StudyYee-Yang Teing, Ali Dehghantanha, Kim-Kwang Raymond Choo
The pervasive nature of cloud-enabled big data storage solutions introduces new challenges in the identification, collection, analysis, preservation and archiving of digital evidences. Investigation of such complex platforms to locate and recover traces of criminal activities is a time-consuming process. Hence, cyber forensics researchers are moving towards streamlining the investigation process by locating and documenting residual artefacts (evidences) of forensic value of users activities on cloud-enabled big data platforms in order to reduce the investigation time and resources involved in a real-world investigation. In this paper, we seek to determine the data remnants of forensic value from Syncany private cloud storage service, a popular storage engine for big data platforms. We demonstrate the types and the locations of the artefacts that can be forensically recovered. Findings from this research contribute to an in-depth understanding of cloud-enabled big data storage forensics, which can result in reduced time and resources spent in real-world investigations involving Syncany-based cloud platforms.
CRJul 26, 2018
CloudMe Forensics: A Case of Big-Data InvestigationYee-Yang Teing, Ali Dehghantanha, Kim-Kwang Raymond Choo
The issue of increasing volume, variety and velocity of has been an area of concern in cloud forensics. The high volume of data will, at some point, become computationally exhaustive to be fully extracted and analysed in a timely manner. To cut down the size of investigation, it is important for a digital forensic practitioner to possess a well-rounded knowledge about the most relevant data artefacts from the cloud product investigating. In this paper, we seek to tackle on the residual artefacts from the use of CloudMe cloud storage service. We demonstrate the types and locations of the artefacts relating to the installation, uninstallation, log-in, log-off, and file synchronisation activities from the computer desktop and mobile clients. Findings from this research will pave the way towards the development of data mining methods for cloud-enabled big data endpoint forensics investigation.
CRJul 22, 2018
Digital forensic investigation of two-way radio communication equipment and servicesArie Kouwen, Mark Scanlon, Kim-Kwang Raymond Choo et al.
Historically, radio-equipment has solely been used as a two-way analogue communication device. Today, the use of radio communication equipment is increasing by numerous organisations and businesses. The functionality of these traditionally short-range devices have expanded to include private call, address book, call-logs, text messages, lone worker, telemetry, data communication, and GPS. Many of these devices also integrate with smartphones, which delivers Push-To-Talk services that make it possible to setup connections between users using a two-way radio and a smartphone. In fact, these devices can be used to connect users only using smartphones. To date, there is little research on the digital traces in modern radio communication equipment. In fact, increasing the knowledge base about these radio communication devices and services can be valuable to law enforcement in a police investigation. In this paper, we investigate what kind of radio communication equipment and services law enforcement digital investigators can encounter at a crime scene or in an investigation. Subsequent to seizure of this radio communication equipment we explore the traces, which may have a forensic interest and how these traces can be acquired. Finally, we test our approach on sample radio communication equipment and services.
CRSep 15, 2017
Performance of Android Forensics Data Recovery ToolsBernard Chukwuemeka Ogazi-Onyemaechi, Ali Dehghantanha, Kim-Kwang Raymond Choo
Recovering deleted or hidden data is among most important duties of forensics investigators. Extensive utilisation of smartphones as subject, objects or tools of crime made them an important part of residual forensics. This chapter investigates the effectiveness of mobile forensic data recovery tools in recovering evidences from a Samsung Galaxy S2 i9100 Android phone. We seek to determine the amount of data that could be recovered using Phone image carver, Access data FTK, Foremost, Diskdigger, and Recover My File forensic tools. The findings reflected the difference between recovery capacities of studied tools showing their suitability in their specialised contexts only.
CRAug 29, 2017
Investigation and Automating Extraction of Thumbnails Produced by Image viewersWybren van der Meer, Kim-Kwang Raymond Choo, Nhien-An Le-Khac et al.
Today, in digital forensics, images normally provide important information within an investigation. However, not all images may still be available within a forensic digital investigation as they were all deleted for example. Data carving can be used in this case to retrieve deleted images but the carving time is normally significant and these images can be moreover overwritten by other data. One of the solutions is to look at thumbnails of images that are no longer available. These thumbnails can often be found within databases created by either operating systems or image viewers. In literature, most research and practical focus on the extraction of thumbnails from databases created by the operating system. There is a little research working on the thumbnails created by the image reviewers as these thumbnails are application-driven in terms of pre-defined sizes, adjustments and storage location. Eventually, thumbnail databases from image viewers are significant forensic artefacts for investigators as these programs deal with large amounts of images. However, investigating these databases so far is still manual or semi-automatic task that leads to the huge amount of forensic time. Therefore, in this paper we propose a new approach of automating extraction of thumbnails produced by image viewers. We also test our approach with popular image viewers in different storage structures and locations to show its robustness.
CRAug 17, 2017
Medical Cyber-Physical Systems Development: A Forensics-Driven ApproachGeorge Grispos, William Bradley Glisson, Kim-Kwang Raymond Choo
The synthesis of technology and the medical industry has partly contributed to the increasing interest in Medical Cyber-Physical Systems (MCPS). While these systems provide benefits to patients and professionals, they also introduce new attack vectors for malicious actors (e.g. financially-and/or criminally-motivated actors). A successful breach involving a MCPS can impact patient data and system availability. The complexity and operating requirements of a MCPS complicates digital investigations. Coupling this information with the potentially vast amounts of information that a MCPS produces and/or has access to is generating discussions on, not only, how to compromise these systems but, more importantly, how to investigate these systems. The paper proposes the integration of forensics principles and concepts into the design and development of a MCPS to strengthen an organization's investigative posture. The framework sets the foundation for future research in the refinement of specific solutions for MCPS investigations.
CRJul 15, 2017
Forensic Investigation of P2P Cloud Storage: BitTorrent Sync as a Case StudyTeing Yee Yang, Ali Dehghantanha, Kim-Kwang Raymond Choo et al.
Cloud computing has been regarded as the technology enabler for the Internet of Things (IoT). To ensure the most effective collection of IoT-based evidence, it is vital for forensic practitioners to possess a contemporary understanding of the artefacts from different cloud services. In this paper, we seek to determine the data remnants from the use of BitTorrent Sync version 2.0. Findings from our research using mobile and computer devices running Windows 8.1, Mac OS X Mavericks 10.9.5, Ubuntu 14.04.1 LTS, iOS 7.1.2, and Android KitKat 4.4.4 suggested that artefacts relating to the installation, uninstallation, log-in, log-off, and file synchronisation could be recovered, which are potential sources of IoT forensics. We also present a forensically sound investigation methodology for BitTorrent Sync.
CRJun 25, 2017
Investigating America Online Instant Messaging Application: Data Remnants on Windows 8.1 Client MachineTeing Yee Yang, Ali Dehghantanha, Kim-Kwang Raymond Choo et al.
Instant messaging applications (apps) are one potential source of evidence in a criminal investigation or a civil litigation. To ensure the most effective collection of evidence, it is vital for forensic practitioners to possess an up-to-date knowledge about artefacts of forensic interest from various instant messaging apps. Hence, in this chapter, we study America Online Instant Messenger (version 7.14.5.8) with the aims of contributing to an in-depth understanding of the types of terrestrial artefacts that are likely to remain after the use of instant messaging services and app on Windows 8.1 devices. Potential artefacts identified during the research include data relating to the installation or uninstallation, log-in and log-off information, contact lists, conversations, and transferred files.
CRJun 25, 2017
Honeypots for employee information security awareness and education training: A conceptual EASY training modelLek Christopher, Kim-Kwang Raymond Choo, Ali Dehghantanha
The increasing pervasiveness of internet-connected systems means that such systems will continue to be exploited for criminal purposes by cybercriminals (including malicious insiders such as employees and vendors). The importance of protecting corporate system and intellectual property, and the escalating complexities of the online environment underscore the need for ongoing information security awareness and education training and the promotion of a culture of security among employees. Two honeypots were deployed at a private university based in Singapore. Findings from the analysis of the honeypot data are presented in this paper. This paper then examines how analysis of honeypot data can be used in employee information security awareness and education training. Adapting the Routine Activity Theory, a criminology theory widely used in the study of cybercrime, this paper proposes a conceptual Engaging Stakeholders, Acceptable Behavior, Simple Teaching method, Yardstick (EASY) training model, and explains how the model can be used to design employee information security awareness and education training. Future research directions are also outlined in this paper.
CRJun 25, 2017
Cloud Storage Forensics: Analysis of Data Remnants on SpiderOak, JustCloud, and pCloudSeyedHossein Mohtasebi, Ali Dehghantanha, Kim-Kwang Raymond Choo
STorage as a Service (STaaS) cloud platforms benefits such as getting access to data anywhere, anytime, on a wide range of devices made them very popular among businesses and individuals. As such forensics investigators are increasingly facing cases that involve investigation of STaaS platforms. Therefore, it is essential for cyber investigators to know how to collect, preserve, and analyse evidences of these platforms. In this paper, we describe investigation of three STaaS platforms namely SpiderOak, JustCloud, and pCloud on Windows 8.1 and iOS 8.1.1 devices. Moreover, possible changes on uploaded and downloaded files metadata on these platforms would be tracked and their forensics value would be investigated.
CRMar 17, 2016
Windows Instant Messaging App Forensics: Facebook and Skype as Case StudiesTeing Yee Yang, Ali Dehghantanha, Kim-Kwang Raymond Choo et al.
Instant messaging (IM) has changed the way people communicate with each other. However, the interactive and instant nature of these applications (apps) made them an attractive choice for malicious cyber activities such as phishing. The forensic examination of IM apps for modern Windows 8.1 (or later) has been largely unexplored, as the platform is relatively new. In this paper, we seek to determine the data remnants from the use of two popular Windows Store application software for instant messaging, namely Facebook and Skype on a Windows 8.1 client machine. This research contributes to an in-depth understanding of the types of terrestrial artefacts that are likely to remain after the use of instant messaging services and application software on a contemporary Windows operating system. Potential artefacts detected during the research include data relating to the installation or uninstallation of the instant messaging application software, log-in and log-off information, contact lists, conversations, and transferred files.
CRSep 23, 2015
A Forensically Sound Adversary Model for Mobile DevicesQuang Do, Ben Martini, Kim-Kwang Raymond Choo
In this paper, we propose an adversary model to facilitate forensic investigations of mobile devices (e.g. Android, iOS and Windows smartphones) that can be readily adapted to the latest mobile device technologies. This is essential given the ongoing and rapidly changing nature of mobile device technologies. An integral principle and significant constraint upon forensic practitioners is that of forensic soundness. Our adversary model specifically considers and integrates the constraints of forensic soundness on the adversary, in our case, a forensic practitioner. One construction of the adversary model is an evidence collection and analysis methodology for Android devices. Using the methodology with six popular cloud apps, we were successful in extracting various information of forensic interest in both the external and internal storage of the mobile device.
CRSep 23, 2015
Efficient and Anonymous Two-Factor User Authentication in Wireless Sensor Networks: Achieving User Anonymity with Lightweight Sensor ComputationJunghyun Nam, Kim-Kwang Raymond Choo, Sangchul Han et al.
A smart-card-based user authentication scheme for wireless sensor networks (hereafter referred to as a SCA-WSN scheme) is designed to ensure that only users who possess both a smart card and the corresponding password are allowed to gain access to sensor data and their transmissions. Despite many research efforts in recent years, it remains a challenging task to design an efficient SCA-WSN scheme that achieves user anonymity. The majority of published SCA-WSN schemes use only lightweight cryptographic techniques (rather than public-key cryptographic techniques) for the sake of efficiency, and have been demonstrated to suffer from the inability to provide user anonymity. Some schemes employ elliptic curve cryptography for better security but require sensors with strict resource constraints to perform computationally expensive scalar-point multiplications; despite the increased computational requirements, these schemes do not provide user anonymity. In this paper, we present a new SCA-WSN scheme that not only achieves user anonymity but also is efficient in terms of the computation loads for sensors. Our scheme employs elliptic curve cryptography but restricts its use only to anonymous user-to-gateway authentication, thereby allowing sensors to perform only lightweight cryptographic operations. Our scheme also enjoys provable security in a formal model extended from the widely accepted Bellare-Pointcheval-Rogaway (2000) model to capture the user anonymity property and various SCA-WSN specific attacks (e.g., stolen smart card attacks, node capture attacks, privileged insider attacks, and stolen verifier attacks).
CYJun 18, 2015
Mobile Cloud Forensics: An Analysis of Seven Popular Android AppsBen Martini, Quang Do, Kim-Kwang Raymond Choo
Using the evidence collection and analysis methodology for Android devices proposed by Martini, Do and Choo, we examined and analyzed seven popular Android cloud-based apps. Firstly, we analyzed each app in order to see what information could be obtained from their private app storage and SD card directories. We collated the information and used it to aid our investigation of each app database files and AccountManager data. To complete our understanding of the forensic artefacts stored by apps we analyzed, we performed further analysis on the apps to determine if the user authentication credentials could be collected for each app based on the information gained in the initial analysis stages. The contributions of this research include a detailed description of artefacts, which are of general forensic interest, for each app analyzed.