CRApr 25, 2023
Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-ThingsAditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammad Saidur Rahman et al.
This paper proposes a blockchain-based Federated Learning (FL) framework with Intel Software Guard Extension (SGX)-based Trusted Execution Environment (TEE) to securely aggregate local models in Industrial Internet-of-Things (IIoTs). In FL, local models can be tampered with by attackers. Hence, a global model generated from the tampered local models can be erroneous. Therefore, the proposed framework leverages a blockchain network for secure model aggregation. Each blockchain node hosts an SGX-enabled processor that securely performs the FL-based aggregation tasks to generate a global model. Blockchain nodes can verify the authenticity of the aggregated model, run a blockchain consensus mechanism to ensure the integrity of the model, and add it to the distributed ledger for tamper-proof storage. Each cluster can obtain the aggregated model from the blockchain and verify its integrity before using it. We conducted several experiments with different CNN models and datasets to evaluate the performance of the proposed framework.
LGMay 23, 2024
Dynamic Graph Unlearning: A General and Efficient Post-Processing Method via Gradient TransformationHe Zhang, Bang Wu, Xiangwen Yang et al.
Dynamic graph neural networks (DGNNs) have emerged and been widely deployed in various web applications (e.g., Reddit) to serve users (e.g., personalized content delivery) due to their remarkable ability to learn from complex and dynamic user interaction data. Despite benefiting from high-quality services, users have raised privacy concerns, such as misuse of personal data (e.g., dynamic user-user/item interaction) for model training, requiring DGNNs to ``forget'' their data to meet AI governance laws (e.g., the ``right to be forgotten'' in GDPR). However, current static graph unlearning studies cannot \textit{unlearn dynamic graph elements} and exhibit limitations such as the model-specific design or reliance on pre-processing, which disenable their practicability in dynamic graph unlearning. To this end, we study the dynamic graph unlearning for the first time and propose an effective, efficient, general, and post-processing method to implement DGNN unlearning. Specifically, we first formulate dynamic graph unlearning in the context of continuous-time dynamic graphs, and then propose a method called Gradient Transformation that directly maps the unlearning request to the desired parameter update. Comprehensive evaluations on six real-world datasets and state-of-the-art DGNN backbones demonstrate its effectiveness (e.g., limited drop or obvious improvement in utility) and efficiency (e.g., 7.23$\times$ speed-up) advantages. Additionally, our method has the potential to handle future unlearning requests with significant performance gains (e.g., 32.59$\times$ speed-up).
CVNov 5, 2024
Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and DenoisingTao Huang, Jiayang Meng, Hong Chen et al.
We investigate the construction of gradient-guided conditional diffusion models for reconstructing private images, focusing on the adversarial interplay between differential privacy noise and the denoising capabilities of diffusion models. While current gradient-based reconstruction methods struggle with high-resolution images due to computational complexity and prior knowledge requirements, we propose two novel methods that require minimal modifications to the diffusion model's generation process and eliminate the need for prior knowledge. Our approach leverages the strong image generation capabilities of diffusion models to reconstruct private images starting from randomly generated noise, even when a small amount of differentially private noise has been added to the gradients. We also conduct a comprehensive theoretical analysis of the impact of differential privacy noise on the quality of reconstructed images, revealing the relationship among noise magnitude, the architecture of attacked models, and the attacker's reconstruction capability. Additionally, extensive experiments validate the effectiveness of our proposed methods and the accuracy of our theoretical findings, suggesting new directions for privacy risk auditing using conditional diffusion models.
CLMar 19, 2025
Deep Contrastive Unlearning for Language ModelsEstrid He, Tabinda Sarwar, Ibrahim Khalil et al.
The past a few years have witnessed the great success of large language models, demonstrating powerful capabilities in comprehending textual data and generating human-like languages. Large language models achieve success by being trained on vast amounts of textual data, including online sources with copyrighted content and user-generated knowledge. However, this comes at a cost: the potential risk of exposing users' privacy and violating copyright protections. Thus, to safeguard individuals' "right to be forgotten", there has been increasing interests in machine unlearning -- the process of removing information carried by particular training samples from a model while not deteriorating its predictive quality. This is a challenging task due to the black-box nature of language models. Most existing studies focus on mitigating the impact of those forgot samples upon a model's outputs, and do not explicitly consider the geometric distributions of samples in the latent space of a model. To address this issue, we propose a machine unlearning framework, named Deep Contrastive Unlearning for fine-Tuning (DeepCUT) language models. Our proposed model achieves machine unlearning by directly optimizing the latent space of a model. Comprehensive experiments on real-world datasets demonstrate the effectiveness and efficiency of DeepCUT with consistent and significant improvement over baseline methods.
CYNov 2, 2024
Cloned Identity Detection in Social-Sensor Clouds based on Incomplete ProfilesAhmed Alharbi, Hai Dong, Xun Yi et al.
We propose a novel approach to effectively detect cloned identities of social-sensor cloud service providers (i.e. social media users) in the face of incomplete non-privacy-sensitive profile data. Named ICD-IPD, the proposed approach first extracts account pairs with similar usernames or screen names from a given set of user accounts collected from a social media. It then learns a multi-view representation associated with a given account and extracts two categories of features for every single account. These two categories of features include profile and Weighted Generalised Canonical Correlation Analysis (WGCCA)-based features that may potentially contain missing values. To counter the impact of such missing values, a missing value imputer will next impute the missing values of the aforementioned profile and WGCCA-based features. After that, the proposed approach further extracts two categories of augmented features for each account pair identified previously, namely, 1) similarity and 2) differences-based features. Finally, these features are concatenated and fed into a Light Gradient Boosting Machine classifier to detect identity cloning. We evaluated and compared the proposed approach against the existing state-of-the-art identity cloning approaches and other machine or deep learning models atop a real-world dataset. The experimental results show that the proposed approach outperforms the state-of-the-art approaches and models in terms of Precision, Recall and F1-score.
CROct 8, 2025
Unsupervised Backdoor Detection and Mitigation for Spiking Neural NetworksJiachen Li, Bang Wu, Xiaoyu Xia et al.
Spiking Neural Networks (SNNs) have gained increasing attention for their superior energy efficiency compared to Artificial Neural Networks (ANNs). However, their security aspects, particularly under backdoor attacks, have received limited attention. Existing defense methods developed for ANNs perform poorly or can be easily bypassed in SNNs due to their event-driven and temporal dependencies. This paper identifies the key blockers that hinder traditional backdoor defenses in SNNs and proposes an unsupervised post-training detection framework, Temporal Membrane Potential Backdoor Detection (TMPBD), to overcome these challenges. TMPBD leverages the maximum margin statistics of temporal membrane potential (TMP) in the final spiking layer to detect target labels without any attack knowledge or data access. We further introduce a robust mitigation mechanism, Neural Dendrites Suppression Backdoor Mitigation (NDSBM), which clamps dendritic connections between early convolutional layers to suppress malicious neurons while preserving benign behaviors, guided by TMP extracted from a small, clean, unlabeled dataset. Extensive experiments on multiple neuromorphic benchmarks and state-of-the-art input-aware dynamic trigger attacks demonstrate that TMPBD achieves 100% detection accuracy, while NDSBM reduces the attack success rate from 100% to 8.44%, and to 2.81% when combined with detection, without degrading clean accuracy.
CRAug 13, 2025
Social-Sensor Identity Cloning Detection Using Weakly Supervised Deep Forest and Cryptographic AuthenticationAhmed Alharbi, Hai Dong, Xun Yi
Recent years have witnessed a rising trend in social-sensor cloud identity cloning incidents. However, existing approaches suffer from unsatisfactory performance, a lack of solutions for detecting duplicated accounts, and a lack of large-scale evaluations on real-world datasets. We introduce a novel method for detecting identity cloning in social-sensor cloud service providers. Our proposed technique consists of two primary components: 1) a similar identity detection method and 2) a cryptography-based authentication protocol. Initially, we developed a weakly supervised deep forest model to identify similar identities using non-privacy-sensitive user profile features provided by the service. Subsequently, we designed a cryptography-based authentication protocol to verify whether similar identities were generated by the same provider. Our extensive experiments on a large real-world dataset demonstrate the feasibility and superior performance of our technique compared to current state-of-the-art identity clone detection methods.
LGNov 5, 2024
Enhancing DP-SGD through Non-monotonous Adaptive Scaling Gradient WeightTao Huang, Qingyu Huang, Xin Shi et al.
In the domain of deep learning, the challenge of protecting sensitive data while maintaining model utility is significant. Traditional Differential Privacy (DP) techniques such as Differentially Private Stochastic Gradient Descent (DP-SGD) typically employ strategies like direct or per-sample adaptive gradient clipping. These methods, however, compromise model accuracy due to their critical influence on gradient handling, particularly neglecting the significant contribution of small gradients during later training stages. In this paper, we introduce an enhanced version of DP-SGD, named Differentially Private Per-sample Adaptive Scaling Clipping (DP-PSASC). This approach replaces traditional clipping with non-monotonous adaptive gradient scaling, which alleviates the need for intensive threshold setting and rectifies the disproportionate weighting of smaller gradients. Our contribution is twofold. First, we develop a novel gradient scaling technique that effectively assigns proper weights to gradients, particularly small ones, thus improving learning under differential privacy. Second, we integrate a momentum-based method into DP-PSASC to reduce bias from stochastic sampling, enhancing convergence rates. Our theoretical and empirical analyses confirm that DP-PSASC preserves privacy and delivers superior performance across diverse datasets, setting new standards for privacy-sensitive applications.
LGJun 24, 2024
Machine Unlearning with Minimal Gradient Dependence for High Unlearning RatiosTao Huang, Ziyang Chen, Jiayang Meng et al.
In the context of machine unlearning, the primary challenge lies in effectively removing traces of private data from trained models while maintaining model performance and security against privacy attacks like membership inference attacks. Traditional gradient-based unlearning methods often rely on extensive historical gradients, which becomes impractical with high unlearning ratios and may reduce the effectiveness of unlearning. Addressing these limitations, we introduce Mini-Unlearning, a novel approach that capitalizes on a critical observation: unlearned parameters correlate with retrained parameters through contraction mapping. Our method, Mini-Unlearning, utilizes a minimal subset of historical gradients and leverages this contraction mapping to facilitate scalable, efficient unlearning. This lightweight, scalable method significantly enhances model accuracy and strengthens resistance to membership inference attacks. Our experiments demonstrate that Mini-Unlearning not only works under higher unlearning ratios but also outperforms existing techniques in both accuracy and security, offering a promising solution for applications requiring robust unlearning capabilities.
CRMay 16, 2023
Trustworthy Privacy-preserving Hierarchical Ensemble and Federated Learning in Healthcare 4.0 with BlockchainVeronika Stephanie, Ibrahim Khalil, Mohammed Atiquzzaman et al.
The advancement of Internet and Communication Technologies (ICTs) has led to the era of Industry 4.0. This shift is followed by healthcare industries creating the term Healthcare 4.0. In Healthcare 4.0, the use of IoT-enabled medical imaging devices for early disease detection has enabled medical practitioners to increase healthcare institutions' quality of service. However, Healthcare 4.0 is still lagging in Artificial Intelligence and big data compared to other Industry 4.0 due to data privacy concerns. In addition, institutions' diverse storage and computing capabilities restrict institutions from incorporating the same training model structure. This paper presents a secure multi-party computation-based ensemble federated learning with blockchain that enables heterogeneous models to collaboratively learn from healthcare institutions' data without violating users' privacy. Blockchain properties also allow the party to enjoy data integrity without trust in a centralized server while also providing each healthcare institution with auditability and version control capability.
CRFeb 4, 2022
Aggregation Service for Federated Learning: An Efficient, Secure, and More Resilient RealizationYifeng Zheng, Shangqi Lai, Yi Liu et al.
Federated learning has recently emerged as a paradigm promising the benefits of harnessing rich data from diverse sources to train high quality models, with the salient features that training datasets never leave local devices. Only model updates are locally computed and shared for aggregation to produce a global model. While federated learning greatly alleviates the privacy concerns as opposed to learning with centralized data, sharing model updates still poses privacy risks. In this paper, we present a system design which offers efficient protection of individual model updates throughout the learning procedure, allowing clients to only provide obscured model updates while a cloud server can still perform the aggregation. Our federated learning system first departs from prior works by supporting lightweight encryption and aggregation, and resilience against drop-out clients with no impact on their participation in future rounds. Meanwhile, prior work largely overlooks bandwidth efficiency optimization in the ciphertext domain and the support of security against an actively adversarial cloud server, which we also fully explore in this paper and provide effective and efficient mechanisms. Extensive experiments over several benchmark datasets (MNIST, CIFAR-10, and CelebA) show our system achieves accuracy comparable to the plaintext baseline, with practical performance.
CRNov 22, 2021
Chaos and Logistic Map based Key Generation Technique for AES-driven IoT SecurityZiaur Rahman, Xun Yi, Ibrahim Khalil et al.
Several efforts have been seen claiming the lightweight block ciphers as a necessarily suitable substitute in securing the Internet of Things. Currently, it has been able to envisage as a pervasive frame of reference almost all across the privacy preserving of smart and sensor-oriented appliances. Different approaches are likely to be inefficient, bringing desired degree of security considering the easiness and surely the process of simplicity but security. Strengthening the well-known symmetric key and block dependent algorithm using either chaos motivated logistic map or elliptic curve has shown a far-reaching potential to be a discretion in secure real-time communication. The popular feature of logistic maps, such as the un-foreseeability and randomness often expected to be used in dynamic key-propagation in sync with chaos and scheduling technique towards data integrity. As a bit alternation in keys, able to come up with oversize deviation, also would have consequence to leverage data confidentiality. Henceforth it may have proximity to time consumption, which may lead to a challenge to make sure instant data exchange between participating node entities. In consideration of delay latency required to both secure encryption and decryption, the proposed approach suggests a modification on the key-origination matrix along with S-box. It has plausibly been taken us to this point that the time required proportionate to the plain-text sent while the plain-text disproportionate to the probability happening a letter on the message made. In line with that the effort so far sought how apparent chaos escalates the desired key-initiation before message transmission.
CRNov 22, 2021
Blockchain for IoT: A Critical Analysis Concerning Performance and ScalabilityZiaur Rahman, Xun Yi, Ibrahim Khalil et al.
The world has been experiencing a mind-blowing expansion of blockchain technology since it was first introduced as an emerging means of cryptocurrency called bitcoin. Currently, it has been regarded as a pervasive frame of reference across almost all research domains, ranging from virtual cash to agriculture or even supply-chain to the Internet of Things. The ability to have a self-administering register with legitimate immutability makes blockchain appealing for the Internet of Things (IoT). As billions of IoT devices are now online in distributed fashion, the huge challenges and questions require to addressed in pursuit of urgently needed solutions. The present paper has been motivated by the aim of facilitating such efforts. The contribution of this work is to figure out those trade-offs the IoT ecosystem usually encounters because of the wrong choice of blockchain technology. Unlike a survey or review, the critical findings of this paper target sorting out specific security challenges of blockchain-IoT Infrastructure. The contribution includes how to direct developers and researchers in this domain to pick out the unblemished combinations of Blockchain enabled IoT applications. In addition, the paper promises to bring a deep insight on Ethereum, Hyperledger blockchain and IOTA technology to show their limitations and prospects in terms of performance and scalability.
SIOct 21, 2021
Privacy-Aware Identity Cloning Detection based on Deep ForestAhmed Alharbi, Hai Dong, Xun Yi et al.
We propose a novel method to detect identity cloning of social-sensor cloud service providers to prevent the detrimental outcomes caused by identity deception. This approach leverages non-privacy-sensitive user profile data gathered from social networks and a powerful deep learning model to perform cloned identity detection. We evaluated the proposed method against the state-of-the-art identity cloning detection techniques and the other popular identity deception detection models atop a real-world dataset. The results show that our method significantly outperforms these techniques/models in terms of Precision and F1-score.
CRJun 24, 2021
Blockchain-based Security Framework for Critical Industry 4.0 Cyber-physical SystemZiaur Rahman, Ibrahim Khalil, Xun Yi et al.
There has been an intense concern for security alternatives because of the recent rise of cyber attacks, mainly targeting critical systems such as industry, medical, or energy ecosystem. Though the latest industry infrastructures largely depend on AI-driven maintenance, the prediction based on corrupted data undoubtedly results in loss of life and capital. Admittedly, an inadequate data-protection mechanism can readily challenge the security and reliability of the network. The shortcomings of the conventional cloud or trusted certificate-driven techniques have motivated us to exhibit a unique Blockchain-based framework for a secure and efficient industry 4.0 system. The demonstrated framework obviates the long-established certificate authority after enhancing the consortium Blockchain that reduces the data processing delay, and increases cost-effective throughput. Nonetheless, the distributed industry 4.0 security model entails cooperative trust than depending on a single party, which in essence indulges the costs and threat of the single point of failure. Therefore, multi-signature technique of the proposed framework accomplishes the multi-party authentication, which confirms its applicability for the real-time and collaborative cyber-physical system.
CRMar 8, 2021
Social Media Identity Deception Detection: A SurveyAhmed Alharbi, Hai Dong, Xun Yi et al.
Social media have been growing rapidly and become essential elements of many people's lives. Meanwhile, social media have also come to be a popular source for identity deception. Many social media identity deception cases have arisen over the past few years. Recent studies have been conducted to prevent and detect identity deception. This survey analyses various identity deception attacks, which can be categorized into fake profile, identity theft and identity cloning. This survey provides a detailed review of social media identity deception detection techniques. It also identifies primary research challenges and issues in the existing detection techniques. This article is expected to benefit both researchers and social media providers.
CRDec 14, 2014
Privacy-Preserving and Outsourced Multi-User k-Means ClusteringBharath K. Samanthula, Fang-Yu Rao, Elisa Bertino et al.
Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Often, the entities involved in the data mining process are end-users or organizations with limited computing and storage resources. As a result, such entities may want to refrain from participating in the PPDM process. To overcome this issue and to take many other benefits of cloud computing, outsourcing PPDM tasks to the cloud environment has recently gained special attention. We consider the scenario where n entities outsource their databases (in encrypted format) to the cloud and ask the cloud to perform the clustering task on their combined data in a privacy-preserving manner. We term such a process as privacy-preserving and outsourced distributed clustering (PPODC). In this paper, we propose a novel and efficient solution to the PPODC problem based on k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers altogether through an efficient transformation technique. Our solution builds the clusters securely in an iterative fashion and returns the final cluster centers to all entities when a pre-determined termination condition holds. The proposed solution protects data confidentiality of all the participating entities under the standard semi-honest model. To the best of our knowledge, ours is the first work to discuss and propose a comprehensive solution to the PPODC problem that incurs negligible cost on the participating entities. We theoretically estimate both the computation and communication costs of the proposed protocol and also demonstrate its practical value through experiments on a real dataset.