Sanjay Jha

CR
h-index67
29papers
998citations
Novelty39%
AI Score46

29 Papers

SDAug 24, 2023
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations

Wenbin Wang, Yang Song, Sanjay Jha

While most research into speech synthesis has focused on synthesizing high-quality speech for in-dataset speakers, an equally essential yet unsolved problem is synthesizing speech for unseen speakers who are out-of-dataset with limited reference data, i.e., speaker adaptive speech synthesis. Many studies have proposed zero-shot speaker adaptive text-to-speech and voice conversion approaches aimed at this task. However, most current approaches suffer from the degradation of naturalness and speaker similarity when synthesizing speech for unseen speakers (i.e., speakers not in the training dataset) due to the poor generalizability of the model in out-of-distribution data. To address this problem, we propose GZS-TV, a generalizable zero-shot speaker adaptive text-to-speech and voice conversion model. GZS-TV introduces disentangled representation learning for both speaker embedding extraction and timbre transformation to improve model generalization and leverages the representation learning capability of the variational autoencoder to enhance the speaker encoder. Our experiments demonstrate that GZS-TV reduces performance degradation on unseen speakers and outperforms all baseline models in multiple datasets.

MMSep 19, 2022
AutoLV: Automatic Lecture Video Generator

Wenbin Wang, Yang Song, Sanjay Jha

We propose an end-to-end lecture video generation system that can generate realistic and complete lecture videos directly from annotated slides, instructor's reference voice and instructor's reference portrait video. Our system is primarily composed of a speech synthesis module with few-shot speaker adaptation and an adversarial learning-based talking-head generation module. It is capable of not only reducing instructors' workload but also changing the language and accent which can help the students follow the lecture more easily and enable a wider dissemination of lecture contents. Our experimental results show that the proposed model outperforms other current approaches in terms of authenticity, naturalness and accuracy. Here is a video demonstration of how our system works, and the outcomes of the evaluation and comparison: https://youtu.be/cY6TYkI0cog.

CRJul 13, 2022
PhishSim: Aiding Phishing Website Detection with a Feature-Free Tool

Rizka Purwanto, Arindam Pal, Alan Blair et al.

In this paper, we propose a feature-free method for detecting phishing websites using the Normalized Compression Distance (NCD), a parameter-free similarity measure which computes the similarity of two websites by compressing them, thus eliminating the need to perform any feature extraction. It also removes any dependence on a specific set of website features. This method examines the HTML of webpages and computes their similarity with known phishing websites, in order to classify them. We use the Furthest Point First algorithm to perform phishing prototype extractions, in order to select instances that are representative of a cluster of phishing webpages. We also introduce the use of an incremental learning algorithm as a framework for continuous and adaptive detection without extracting new features when concept drift occurs. On a large dataset, our proposed method significantly outperforms previous methods in detecting phishing websites, with an AUC score of 98.68%, a high true positive rate (TPR) of around 90%, while maintaining a low false positive rate (FPR) of 0.58%. Our approach uses prototypes, eliminating the need to retain long term data in the future, and is feasible to deploy in real systems with a processing time of roughly 0.3 seconds.

LGMar 13Code
Privacy-Preserving Machine Learning for IoT: A Cross-Paradigm Survey and Future Roadmap

Zakia Zaman, Praveen Gauravaram, Mahbub Hassan et al.

The rapid proliferation of the Internet of Things has intensified demand for robust privacy-preserving machine learning mechanisms to safeguard sensitive data generated by large-scale, heterogeneous, and resource-constrained devices. Unlike centralized environments, IoT ecosystems are inherently decentralized, bandwidth-limited, and latency-sensitive, exposing privacy risks across sensing, communication, and distributed training pipelines. These characteristics render conventional anonymization and centralized protection strategies insufficient for practical deployments. This survey presents a comprehensive IoT-centric, cross-paradigm analysis of privacy-preserving machine learning. We introduce a structured taxonomy spanning perturbation-based mechanisms such as differential privacy, distributed paradigms such as federated learning, cryptographic approaches including homomorphic encryption and secure multiparty computation, and generative synthesis techniques based on generative adversarial networks. For each paradigm, we examine formal privacy guarantees, computational and communication complexity, scalability under heterogeneous device participation, and resilience against threats including membership inference, model inversion, gradient leakage, and adversarial manipulation. We further analyze deployment constraints in wireless IoT environments, highlighting trade-offs between privacy, communication overhead, model convergence, and system efficiency within next-generation mobile architectures. We also consolidate evaluation methodologies, summarize representative datasets and open-source frameworks, and identify open challenges including hybrid privacy integration, energy-aware learning, privacy-preserving large language models, and quantum-resilient machine learning.

CRFeb 2
Human Society-Inspired Approaches to Agentic AI Security: The 4C Framework

Alsharif Abuadbba, Nazatul Sultan, Surya Nepal et al.

AI is moving from domain-specific autonomy in closed, predictable settings to large-language-model-driven agents that plan and act in open, cross-organizational environments. As a result, the cybersecurity risk landscape is changing in fundamental ways. Agentic AI systems can plan, act, collaborate, and persist over time, functioning as participants in complex socio-technical ecosystems rather than as isolated software components. Although recent work has strengthened defenses against model and pipeline level vulnerabilities such as prompt injection, data poisoning, and tool misuse, these system centric approaches may fail to capture risks that arise from autonomy, interaction, and emergent behavior. This article introduces the 4C Framework for multi-agent AI security, inspired by societal governance. It organizes agentic risks across four interdependent dimensions: Core (system, infrastructure, and environmental integrity), Connection (communication, coordination, and trust), Cognition (belief, goal, and reasoning integrity), and Compliance (ethical, legal, and institutional governance). By shifting AI security from a narrow focus on system-centric protection to the broader preservation of behavioral integrity and intent, the framework complements existing AI security strategies and offers a principled foundation for building agentic AI systems that are trustworthy, governable, and aligned with human values.

LGMay 14, 2022
Fake News Quick Detection on Dynamic Heterogeneous Information Networks

Jin Ho Go, Alina Sari, Jiaojiao Jiang et al.

The spread of fake news has caused great harm to society in recent years. So the quick detection of fake news has become an important task. Some current detection methods often model news articles and other related components as a static heterogeneous information network (HIN) and use expensive message-passing algorithms. However, in the real-world, quickly identifying fake news is of great significance and the network may vary over time in terms of dynamic nodes and edges. Therefore, in this paper, we propose a novel Dynamic Heterogeneous Graph Neural Network (DHGNN) for fake news quick detection. More specifically, we first implement BERT and fine-tuned BERT to get a semantic representation of the news article contents and author profiles and convert it into graph data. Then, we construct the heterogeneous news-author graph to reflect contextual information and relationships. Additionally, we adapt ideas from personalized PageRank propagation and dynamic propagation to heterogeneous networks in order to reduce the time complexity of back-propagating through many nodes during training. Experiments on three real-world fake news datasets show that DHGNN can outperform other GNN-based models in terms of both effectiveness and efficiency.

CRSep 25, 2024
Examining the Rat in the Tunnel: Interpretable Multi-Label Classification of Tor-based Malware

Ishan Karunanayake, Mashael AlSabah, Nadeem Ahmed et al.

Despite being the most popular privacy-enhancing network, Tor is increasingly adopted by cybercriminals to obfuscate malicious traffic, hindering the identification of malware-related communications between compromised devices and Command and Control (C&C) servers. This malicious traffic can induce congestion and reduce Tor's performance, while encouraging network administrators to block Tor traffic. Recent research, however, demonstrates the potential for accurately classifying captured Tor traffic as malicious or benign. While existing efforts have addressed malware class identification, their performance remains limited, with micro-average precision and recall values around 70%. Accurately classifying specific malware classes is crucial for effective attack prevention and mitigation. Furthermore, understanding the unique patterns and attack vectors employed by different malware classes helps the development of robust and adaptable defence mechanisms. We utilise a multi-label classification technique based on Message-Passing Neural Networks, demonstrating its superiority over previous approaches such as Binary Relevance, Classifier Chains, and Label Powerset, by achieving micro-average precision (MAP) and recall (MAR) exceeding 90%. Compared to previous work, we significantly improve performance by 19.98%, 10.15%, and 59.21% in MAP, MAR, and Hamming Loss, respectively. Next, we employ Explainable Artificial Intelligence (XAI) techniques to interpret the decision-making process within these models. Finally, we assess the robustness of all techniques by crafting adversarial perturbations capable of manipulating classifier predictions and generating false positives and negatives.

CRJul 19, 2024
AuditNet: A Conversational AI-based Security Assistant [DEMO]

Shohreh Deldari, Mohammad Goudarzi, Aditya Joshi et al.

In the age of information overload, professionals across various fields face the challenge of navigating vast amounts of documentation and ever-evolving standards. Ensuring compliance with standards, regulations, and contractual obligations is a critical yet complex task across various professional fields. We propose a versatile conversational AI assistant framework designed to facilitate compliance checking on the go, in diverse domains, including but not limited to network infrastructure, legal contracts, educational standards, environmental regulations, and government policies. By leveraging retrieval-augmented generation using large language models, our framework automates the review, indexing, and retrieval of relevant, context-aware information, streamlining the process of verifying adherence to established guidelines and requirements. This AI assistant not only reduces the manual effort involved in compliance checks but also enhances accuracy and efficiency, supporting professionals in maintaining high standards of practice and ensuring regulatory compliance in their respective fields. We propose and demonstrate AuditNet, the first conversational AI security assistant designed to assist IoT network security experts by providing instant access to security standards, policies, and regulations.

LGOct 15, 2024Code
Adversarially Guided Stateful Defense Against Backdoor Attacks in Federated Deep Learning

Hassan Ali, Surya Nepal, Salil S. Kanhere et al.

Recent works have shown that Federated Learning (FL) is vulnerable to backdoor attacks. Existing defenses cluster submitted updates from clients and select the best cluster for aggregation. However, they often rely on unrealistic assumptions regarding client submissions and sampled clients population while choosing the best cluster. We show that in realistic FL settings, state-of-the-art (SOTA) defenses struggle to perform well against backdoor attacks in FL. To address this, we highlight that backdoored submissions are adversarially biased and overconfident compared to clean submissions. We, therefore, propose an Adversarially Guided Stateful Defense (AGSD) against backdoor attacks on Deep Neural Networks (DNNs) in FL scenarios. AGSD employs adversarial perturbations to a small held-out dataset to compute a novel metric, called the trust index, that guides the cluster selection without relying on any unrealistic assumptions regarding client submissions. Moreover, AGSD maintains a trust state history of each client that adaptively penalizes backdoored clients and rewards clean clients. In realistic FL settings, where SOTA defenses mostly fail to resist attacks, AGSD mostly outperforms all SOTA defenses with minimal drop in clean accuracy (5% in the worst-case compared to best accuracy) even when (a) given a very small held-out dataset -- typically AGSD assumes 50 samples (<= 0.1% of the training data) and (b) no heldout dataset is available, and out-of-distribution data is used instead. For reproducibility, our code will be openly available at: https://github.com/hassanalikhatim/AGSD.

SDApr 28, 2024
USAT: A Universal Speaker-Adaptive Text-to-Speech Approach

Wenbin Wang, Yang Song, Sanjay Jha

Conventional text-to-speech (TTS) research has predominantly focused on enhancing the quality of synthesized speech for speakers in the training dataset. The challenge of synthesizing lifelike speech for unseen, out-of-dataset speakers, especially those with limited reference data, remains a significant and unresolved problem. While zero-shot or few-shot speaker-adaptive TTS approaches have been explored, they have many limitations. Zero-shot approaches tend to suffer from insufficient generalization performance to reproduce the voice of speakers with heavy accents. While few-shot methods can reproduce highly varying accents, they bring a significant storage burden and the risk of overfitting and catastrophic forgetting. In addition, prior approaches only provide either zero-shot or few-shot adaptation, constraining their utility across varied real-world scenarios with different demands. Besides, most current evaluations of speaker-adaptive TTS are conducted only on datasets of native speakers, inadvertently neglecting a vast portion of non-native speakers with diverse accents. Our proposed framework unifies both zero-shot and few-shot speaker adaptation strategies, which we term as "instant" and "fine-grained" adaptations based on their merits. To alleviate the insufficient generalization performance observed in zero-shot speaker adaptation, we designed two innovative discriminators and introduced a memory mechanism for the speech decoder. To prevent catastrophic forgetting and reduce storage implications for few-shot speaker adaptation, we designed two adapters and a unique adaptation procedure.

CRAug 1, 2025
Demo: TOSense -- What Did You Just Agree to?

Xinzhang Chen, Hassan Ali, Arash Shaghaghi et al.

Online services often require users to agree to lengthy and obscure Terms of Service (ToS), leading to information asymmetry and legal risks. This paper proposes TOSense-a Chrome extension that allows users to ask questions about ToS in natural language and get concise answers in real time. The system combines (i) a crawler "tos-crawl" that automatically extracts ToS content, and (ii) a lightweight large language model pipeline: MiniLM for semantic retrieval and BART-encoder for answer relevance verification. To avoid expensive manual annotation, we present a novel Question Answering Evaluation Pipeline (QEP) that generates synthetic questions and verifies the correctness of answers using clustered topic matching. Experiments on five major platforms, Apple, Google, X (formerly Twitter), Microsoft, and Netflix, show the effectiveness of TOSense (with up to 44.5% accuracy) across varying number of topic clusters. During the demonstration, we will showcase TOSense in action. Attendees will be able to experience seamless extraction, interactive question answering, and instant indexing of new sites.

LGAug 27, 2021
Man versus Machine: AutoML and Human Experts' Role in Phishing Detection

Rizka Purwanto, Arindam Pal, Alan Blair et al.

Machine learning (ML) has developed rapidly in the past few years and has successfully been utilized for a broad range of tasks, including phishing detection. However, building an effective ML-based detection system is not a trivial task, and requires data scientists with knowledge of the relevant domain. Automated Machine Learning (AutoML) frameworks have received a lot of attention in recent years, enabling non-ML experts in building a machine learning model. This brings to an intriguing question of whether AutoML can outperform the results achieved by human data scientists. Our paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets to see whether AutoML-based models can outperform manually crafted machine learning models. Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks, specifically in datasets where the features are not quite discriminative, and datasets with overlapping classes or relatively high degrees of non-linearity. Challenges also remain in building a real-world phishing detection system using AutoML frameworks due to the current support only on supervised classification problems, leading to the need for labeled data, and the inability to update the AutoML-based models incrementally. This indicates that experts with knowledge in the domain of phishing and cybersecurity are still essential in the loop of the phishing detection pipeline.

CRMar 10, 2021
DIMY: Enabling Privacy-preserving Contact Tracing

Nadeem Ahmed, Regio A. Michelin, Wanli Xue et al.

The infection rate of COVID-19 and lack of an approved vaccine has forced governments and health authorities to adopt lockdowns, increased testing, and contact tracing to reduce the spread of the virus. Digital contact tracing has become a supplement to the traditional manual contact tracing process. However, although there have been a number of digital contact tracing apps proposed and deployed, these have not been widely adopted owing to apprehensions surrounding privacy and security. In this paper, we propose a blockchain-based privacy-preserving contact tracing protocol, "Did I Meet You" (DIMY), that provides full-lifecycle data privacy protection on the devices themselves as well as on the back-end servers, to address most of the privacy concerns associated with existing protocols. We have employed Bloom filters to provide efficient privacy-preserving storage, and have used the Diffie-Hellman key exchange for secret sharing among the participants. We show that DIMY provides resilience against many well known attacks while introducing negligible overheads. DIMY's footprint on the storage space of clients' devices and back-end servers is also significantly lower than other similar state of the art apps.

CRFeb 3, 2021
All Infections are Not Created Equal: Time-Sensitive Prediction of Malware Generated Network Attacks

Zainab Abaid, Dilip Sarkar, Mohamed Ali Kaafar et al.

Many techniques have been proposed for quickly detecting and containing malware-generated network attacks such as large-scale denial of service attacks; unfortunately, much damage is already done within the first few minutes of an attack, before it is identified and contained. There is a need for an early warning system that can predict attacks before they actually manifest, so that upcoming attacks can be prevented altogether by blocking the hosts that are likely to engage in attacks. However, blocking responses may disrupt legitimate processes on blocked hosts; in order to minimise user inconvenience, it is important to also foretell the time when the predicted attacks will occur, so that only the most urgent threats result in auto-blocking responses, while less urgent ones are first manually investigated. To this end, we identify a typical infection sequence followed by modern malware; modelling this sequence as a Markov chain and training it on real malicious traffic, we are able to identify behaviour most likely to lead to attacks and predict 98\% of real-world spamming and port-scanning attacks before they occur. Moreover, using a Semi-Markov chain model, we are able to foretell the time of upcoming attacks, a novel capability that allows accurately predicting the times of 97% of real-world malware attacks. Our work represents an important and timely step towards enabling flexible threat response models that minimise disruption to legitimate users.

LGDec 14, 2020
HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor Attacks for Data Collection Scenarios

Hassan Ali, Surya Nepal, Salil S. Kanhere et al.

We have witnessed the continuing arms race between backdoor attacks and the corresponding defense strategies on Deep Neural Networks (DNNs). Most state-of-the-art defenses rely on the statistical sanitization of the "inputs" or "latent DNN representations" to capture trojan behaviour. In this paper, we first challenge the robustness of such recently reported defenses by introducing a novel variant of targeted backdoor attack, called "low-confidence backdoor attack". We also propose a novel defense technique, called "HaS-Nets". "Low-confidence backdoor attack" exploits the confidence labels assigned to poisoned training samples by giving low values to hide their presence from the defender, both during training and inference. We evaluate the attack against four state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus and ULP-defense, and achieve Attack Success Rate (ASR) of 99%, 63.73%, 91.2% and 80%, respectively. We next present "HaS-Nets" to resist backdoor insertion in the network during training, using a reasonably small healing dataset, approximately 2% to 15% of full training data, to heal the network at each iteration. We evaluate it for different datasets - Fashion-MNIST, CIFAR-10, Consumer Complaint and Urban Sound - and network architectures - MLPs, 2D-CNNs, 1D-CNNs. Our experiments show that "HaS-Nets" can decrease ASRs from over 90% to less than 15%, independent of the dataset, attack configuration and network architecture.

CRSep 28, 2020
De-anonymisation attacks on Tor: A Survey

Ishan Karunanayake, Nadeem Ahmed, Robert Malaney et al.

Anonymity networks are becoming increasingly popular in today's online world as more users attempt to safeguard their online privacy. Tor is currently the most popular anonymity network in use and provides anonymity to both users and services (hidden services). However, the anonymity provided by Tor is also being misused in various ways. Hosting illegal sites for selling drugs, hosting command and control servers for botnets, and distributing censored content are but a few such examples. As a result, various parties, including governments and law enforcement agencies, are interested in attacks that assist in de-anonymising the Tor network, disrupting its operations, and bypassing its censorship circumvention mechanisms. In this survey paper, we review known Tor attacks and identify current techniques for the de-anonymisation of Tor users and hidden services. We discuss these techniques and analyse the practicality of their execution method. We conclude by discussing improvements to the Tor framework that help prevent the surveyed de-anonymisation attacks.

CRJul 22, 2020
PhishZip: A New Compression-based Algorithm for Detecting Phishing Websites

Rizka Purwanto, Arindam Pal, Alan Blair et al.

Phishing has grown significantly in the past few years and is predicted to further increase in the future. The dynamics of phishing introduce challenges in implementing a robust phishing detection system and selecting features which can represent phishing despite the change of attack. In this paper, we propose PhishZip which is a novel phishing detection approach using a compression algorithm to perform website classification and demonstrate a systematic way to construct the word dictionaries for the compression models using word occurrence likelihood analysis. PhishZip outperforms the use of best-performing HTML-based features in past studies, with a true positive rate of 80.04%. We also propose the use of compression ratio as a novel machine learning feature which significantly improves machine learning based phishing detection over previous studies. Using compression ratios as additional features, the true positive rate significantly improves by 30.3% (from 51.47% to 81.77%), while the accuracy increases by 11.84% (from 71.20% to 83.04%).

CRJul 20, 2020
B-FERL: Blockchain based Framework for Securing Smart Vehicles

Chuka Oham, Regio Michelin, Salil S. Kanhere et al.

The ubiquity of connecting technologies in smart vehicles and the incremental automation of its functionalities promise significant benefits, including a significant decline in congestion and road fatalities. However, increasing automation and connectedness broadens the attack surface and heightens the likelihood of a malicious entity successfully executing an attack. In this paper, we propose a Blockchain based Framework for sEcuring smaRt vehicLes (B-FERL). B-FERL uses permissioned blockchain technology to tailor information access to restricted entities in the connected vehicle ecosystem. It also uses a challenge-response data exchange between the vehicles and roadside units to monitor the internal state of the vehicle to identify cases of in-vehicle network compromise. In order to enable authentic and valid communication in the vehicular network, only vehicles with a verifiable record in the blockchain can exchange messages. Through qualitative arguments, we show that B-FERL is resilient to identified attacks. Also, quantitative evaluations in an emulated scenario show that B-FERL ensures a suitable response time and required storage size compatible with realistic scenarios. Finally, we demonstrate how B-FERL achieves various important functions relevant to the automotive ecosystem such as trust management, vehicular forensics and secure vehicular networks.

CRJun 18, 2020
A Survey of COVID-19 Contact Tracing Apps

Nadeem Ahmed, Regio A. Michelin, Wanli Xue et al.

The recent outbreak of COVID-19 has taken the world by surprise, forcing lockdowns and straining public health care systems. COVID-19 is known to be a highly infectious virus, and infected individuals do not initially exhibit symptoms, while some remain asymptomatic. Thus, a non-negligible fraction of the population can, at any given time, be a hidden source of transmissions. In response, many governments have shown great interest in smartphone contact tracing apps that help automate the difficult task of tracing all recent contacts of newly identified infected individuals. However, tracing apps have generated much discussion around their key attributes, including system architecture, data management, privacy, security, proximity estimation, and attack vulnerability. In this article, we provide the first comprehensive review of these much-discussed tracing app attributes. We also present an overview of many proposed tracing app examples, some of which have been deployed countrywide, and discuss the concerns users have reported regarding their usage. We close by outlining potential research directions for next-generation app design, which would facilitate improved tracing and security performance, as well as wide adoption by the population at large.

CRMay 16, 2020
Health Access Broker: Secure, Patient-Controlled Management of Personal Health Records in the Cloud

Zainab Abaid, Arash Shaghaghi, Ravin Gunawardena et al.

Secure and privacy-preserving management of Personal Health Records (PHRs) has proved to be a major challenge in modern healthcare. Current solutions generally do not offer patients a choice in where the data is actually stored and also rely on at least one fully trusted element that patients must also trust with their data. In this work, we present the Health Access Broker (HAB), a patient-controlled service for secure PHR sharing that (a) does not impose a specific storage location (uniquely for a PHR system), and (b) does not assume any of its components to be fully secure against adversarial threats. Instead, HAB introduces a novel auditing and intrusion-detection mechanism where its workflow is securely logged and continuously inspected to provide auditability of data access and quickly detect any intrusions.

CRDec 23, 2019
Leveraging lightweight blockchain to establish data integrity for surveillance cameras

Regio A. Michelin, Nadeem Ahmed, Salil S. Kanhere et al.

The video footage produced by the surveillance cameras is an important evidence to support criminal investigations. Video evidence can be sourced from public (trusted) as well as private (untrusted) surveillance systems. This raises the issue of establishing integrity and auditability for information provided by the untrusted video sources. In this paper, we focus on a airport ecosystem, where multiple entities with varying levels of trust are involved in producing and exchanging video surveillance information. We present a framework to ensure the data integrity of the stored videos, allowing authorities to validate whether video footage has not been tampered. Our proposal uses a lightweight blockchain technology to store the video metadata as blockchain transactions to support the validation of video integrity. The proposed framework also ensures video auditability and non-repudiation. Our evaluations show that the overhead introduced by employing the blockchain to create and query the transactions introduces a very minor latency of a few milliseconds.

CRMay 27, 2019
Risk Analysis Study of Fully Autonomous Vehicle

Chuka Oham, Raja Jurdak, Sanjay Jha

Fully autonomous vehicles are emerging vehicular technologies that have gained significant attention in todays research endeavours. Even though it promises to optimize road safety, the proliferation of wireless and sensor technologies makes it susceptible to cyber threats thus dawdling its adoption. The identification of threats and design of apposite security solutions is therefore pertinent to expedite its adoption. In this paper, we analyse the security risks of the communication infrastructure for the fully autonomous vehicle using a subset of the TVRA methodology by ETSI. We described the model of communication infrastructure. This model clarifies the potential communication possibilities of the vehicle. Then we defined the security objectives and identified threats. Furthermore, we classified risks and propose countermeasures to facilitate the design of security solutions. We find that all identified high impact threats emanates from a particular source and required encryption mechanisms as countermeasures. Finally, we discovered that all threats due to an interaction with humans are of serious consequences.

CRSep 19, 2018
Gwardar: Towards Protecting a Software-Defined Network from Malicious Network Operating Systems

Arash Shaghaghi, Salil S. Kanhere, Mohamed Ali Kaafar et al.

A Software-Defined Network (SDN) controller (aka. Network Operating System or NOS) is regarded as the brain of the network and is the single most critical element responsible to manage an SDN. Complimentary to existing solutions that aim to protect a NOS, we propose an intrusion protection system designed to protect an SDN against a controller that has been successfully compromised. Gwardar maintains a virtual replica of the data plane by intercepting the OpenFlow messages exchanged between the control and data plane. By observing the long-term flow of the packets, Gwardar learns the normal set of trajectories in the data plane for distinct packet headers. Upon detecting an unexpected packet trajectory, it starts by verifying the data plane forwarding devices by comparing the actual packet trajectories with the expected ones computed over the virtual replica. If the anomalous trajectories match the NOS instructions, Gwardar inspects the NOS itself. For this, it submits policies matching the normal set of trajectories and verifies whether the controller submits matching flow rules to the data plane and whether the network view provided to the application plane reflects the changes. Our evaluation results prove the practicality of Gwardar with a high detection accuracy in a reasonable time-frame.

CRJul 7, 2018
Gargoyle: A Network-based Insider Attack Resilient Framework for Organizations

Arash Shaghaghi, Salil S. Kanhere, Mohamed Ali Kaafar et al.

`Anytime, Anywhere' data access model has become a widespread IT policy in organizations making insider attacks even more complicated to model, predict and deter. Here, we propose Gargoyle, a network-based insider attack resilient framework against the most complex insider threats within a pervasive computing context. Compared to existing solutions, Gargoyle evaluates the trustworthiness of an access request context through a new set of contextual attributes called Network Context Attribute (NCA). NCAs are extracted from the network traffic and include information such as the user's device capabilities, security-level, current and prior interactions with other devices, network connection status, and suspicious online activities. Retrieving such information from the user's device and its integrated sensors are challenging in terms of device performance overheads, sensor costs, availability, reliability and trustworthiness. To address these issues, Gargoyle leverages the capabilities of Software-Defined Network (SDN) for both policy enforcement and implementation. In fact, Gargoyle's SDN App can interact with the network controller to create a `defence-in-depth' protection system. For instance, Gargoyle can automatically quarantine a suspicious data requestor in the enterprise network for further investigation or filter out an access request before engaging a data provider. Finally, instead of employing simplistic binary rules in access authorizations, Gargoyle incorporates Function-based Access Control (FBAC) and supports the customization of access policies into a set of functions (e.g., disabling copy, allowing print) depending on the perceived trustworthiness of the context.

CYJun 16, 2018
B-FICA: BlockChain based Framework for Auto-insurance Claim and Adjudication

Chuka Oham, Raja Jurdak, Salil S. Kanhere et al.

In this paper, we propose a partitioned BlockChain based Framework for Auto-insurance Claims and Adjudication (B-FICA) for CAVs that tracks both sensor data and entity interactions with two-sided verification. B-FICA uses permissioned BC with two partitions to share information on a need to know basis. It also uses multi-signed transactions for proof of execution of instructions, for reliability and auditability and also uses a dynamic lightweight consensus and validation protocol to prevent evidence alteration. Qualitative evaluation shows that B-FICA is resilient to several security attacks from potential liable entities. Finally, simulations show that compared to the state of the art, B-FICA reduces processing time and its delay overhead is negligible for practical scenarios and at marginal security cost.

CRApr 1, 2018
Software-Defined Network (SDN) Data Plane Security: Issues, Solutions and Future Directions

Arash Shaghaghi, Mohamed Ali Kaafar, Rajkumar Buyya et al.

Software-Defined Network (SDN) radically changes the network architecture by decoupling the network logic from the underlying forwarding devices. This architectural change rejuvenates the network-layer granting centralized management and re-programmability of the networks. From a security perspective, SDN separates security concerns into control and data plane, and this architectural recomposition brings up exciting opportunities and challenges. The overall perception is that SDN capabilities will ultimately result in improved security. However, in its raw form, SDN could potentially make networks more vulnerable to attacks and harder to protect. In this paper, we focus on identifying challenges faced in securing the data plane of SDN - one of the least explored but most critical components of this technology. We formalize this problem space, identify potential attack scenarios while highlighting possible vulnerabilities and establish a set of requirements and challenges to protect the data plane of SDNs. Moreover, we undertake a survey of existing solutions with respect to the identified threats, identifying their limitations and offer future research directions.

CRFeb 14, 2018
A Blockchain Based Liability Attribution Framework for Autonomous Vehicles

Chuka Oham, Salil S. Kanhere, Raja Jurdak et al.

The advent of autonomous vehicles is envisaged to disrupt the auto insurance liability model.Compared to the the current model where liability is largely attributed to the driver,autonomous vehicles necessitate the consideration of other entities in the automotive ecosystem including the auto manufacturer,software provider,service technician and the vehicle owner.The proliferation of sensors and connecting technologies in autonomous vehicles enables an autonomous vehicle to gather sufficient data for liability attribution,yet increased connectivity exposes the vehicle to attacks from interacting entities.These possibilities motivate potential liable entities to repudiate their involvement in a collision event to evade liability. While the data collected from vehicular sensors and vehicular communications is an integral part of the evidence for arbitrating liability in the event of an accident,there is also a need to record all interactions between the aforementioned entities to identify potential instances of negligence that may have played a role in the accident.In this paper,we propose a BlockChain(BC) based framework that integrates the concerned entities in the liability model and provides untampered evidence for liability attribution and adjudication.We first describe the liability attribution model, identify key requirements and describe the adversarial capabilities of entities. Also,we present a detailed description of data contributing to evidence.Our framework uses permissioned BC and partitions the BC to tailor data access to relevant BC participants.Finally,we conduct a security analysis to verify that the identified requirements are met and resilience of our proposed framework to identified attacks.

CRAug 18, 2017
WedgeTail: An Intrusion Prevention System for the Data Plane of Software Defined Networks

Arash Shaghaghi, Mohamed Ali Kaafar, Sanjay Jha

Networks are vulnerable to disruptions caused by malicious forwarding devices. The situation is likely to worsen in Software Defined Networks (SDNs) with the incompatibility of existing solutions, use of programmable soft switches and the potential of bringing down an entire network through compromised forwarding devices. In this paper, we present WedgeTail, an Intrusion Prevention System (IPS) designed to secure the SDN data plane. WedgeTail regards forwarding devices as points within a geometric space and stores the path packets take when traversing the network as trajectories. To be efficient, it prioritizes forwarding devices before inspection using an unsupervised trajectory-based sampling mechanism. For each of the forwarding device, WedgeTail computes the expected and actual trajectories of packets and `hunts' for any forwarding device not processing packets as expected. Compared to related work, WedgeTail is also capable of distinguishing between malicious actions such as packet drop and generation. Moreover, WedgeTail employs a radically different methodology that enables detecting threats autonomously. In fact, it has no reliance on pre-defined rules by an administrator and may be easily imported to protect SDN networks with different setups, forwarding devices, and controllers. We have evaluated WedgeTail in simulated environments, and it has been capable of detecting and responding to all implanted malicious forwarding devices within a reasonable time-frame. We report on the design, implementation, and evaluation of WedgeTail in this manuscript.

CROct 8, 2016
Towards Policy Enforcement Point as a Service (PEPS)

Arash Shaghaghi, Mohamed Ali, Kaafar et al.

In this paper, we coin the term Policy Enforcement as a Service (PEPS), which enables the provision of innovative inter-layer and inter-domain Access Control. We leverage the architecture of Software-Defined-Network (SDN) to introduce a common network-level enforcement point, which is made available to a range of access control systems. With our PEPS model, it is possible to have a `defense in depth' protection model and drop unsuccessful access requests before engaging the data provider (e.g. a database system). Moreover, the current implementation of access control within the `trusted' perimeter of an organization is no longer a restriction so that the potential for novel, distributed and cooperative security services can be realized. We conduct an analysis of the security requirements and technical challenges for implementing Policy Enforcement as a Service. To illustrate the benefits of our proposal in practice, we include a report on our prototype PEPS-enabled location-based access control.