CRApr 25, 2023
Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-ThingsAditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammad Saidur Rahman et al.
This paper proposes a blockchain-based Federated Learning (FL) framework with Intel Software Guard Extension (SGX)-based Trusted Execution Environment (TEE) to securely aggregate local models in Industrial Internet-of-Things (IIoTs). In FL, local models can be tampered with by attackers. Hence, a global model generated from the tampered local models can be erroneous. Therefore, the proposed framework leverages a blockchain network for secure model aggregation. Each blockchain node hosts an SGX-enabled processor that securely performs the FL-based aggregation tasks to generate a global model. Blockchain nodes can verify the authenticity of the aggregated model, run a blockchain consensus mechanism to ensure the integrity of the model, and add it to the distributed ledger for tamper-proof storage. Each cluster can obtain the aggregated model from the blockchain and verify its integrity before using it. We conducted several experiments with different CNN models and datasets to evaluate the performance of the proposed framework.
CRApr 26, 2023
Blockchain-based Access Control for Secure Smart Industry Management SystemsAditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammad Saidur Rahman et al.
Smart manufacturing systems involve a large number of interconnected devices resulting in massive data generation. Cloud computing technology has recently gained increasing attention in smart manufacturing systems for facilitating cost-effective service provisioning and massive data management. In a cloud-based manufacturing system, ensuring authorized access to the data is crucial. A cloud platform is operated under a single authority. Hence, a cloud platform is prone to a single point of failure and vulnerable to adversaries. An internal or external adversary can easily modify users' access to allow unauthorized users to access the data. This paper proposes a role-based access control to prevent modification attacks by leveraging blockchain and smart contracts in a cloud-based smart manufacturing system. The role-based access control is developed to determine users' roles and rights in smart contracts. The smart contracts are then deployed to the private blockchain network. We evaluate our solution by utilizing Ethereum private blockchain network to deploy the smart contract. The experimental results demonstrate the feasibility and evaluation of the proposed framework's performance.
CVSep 28, 2022
Deep Learning based Automatic Quantification of Urethral Plate Quality using the Plate Objective Scoring Tool (POST)Tariq O. Abbas, Mohamed AbdelMoniem, Ibrahim Khalil et al.
Objectives: To explore the capacity of deep learning algorithm to further streamline and optimize urethral plate (UP) quality appraisal on 2D images using the plate objective scoring tool (POST), aiming to increase the objectivity and reproducibility of UP appraisal in hypospadias repair. Methods: The five key POST landmarks were marked by specialists in a 691-image dataset of prepubertal boys undergoing primary hypospadias repair. This dataset was then used to develop and validate a deep learning-based landmark detection model. The proposed framework begins with glans localization and detection, where the input image is cropped using the predicted bounding box. Next, a deep convolutional neural network (CNN) architecture is used to predict the coordinates of the five POST landmarks. These predicted landmarks are then used to assess UP quality in distal hypospadias. Results: The proposed model accurately localized the glans area, with a mean average precision (mAP) of 99.5% and an overall sensitivity of 99.1%. A normalized mean error (NME) of 0.07152 was achieved in predicting the coordinates of the landmarks, with a mean squared error (MSE) of 0.001 and a 20.2% failure rate at a threshold of 0.1 NME. Conclusions: This deep learning application shows robustness and high precision in using POST to appraise UP quality. Further assessment using international multi-centre image-based databases is ongoing. External validation could benefit deep learning algorithms and lead to better assessments, decision-making and predictions for surgical outcomes.
78.7LGMar 20
GoAgent: Group-of-Agents Communication Topology Generation for LLM-based Multi-Agent SystemsHongjiang Chen, Xin Zheng, Yixin Liu et al.
Large language model (LLM)-based multi-agent systems (MAS) have demonstrated exceptional capabilities in solving complex tasks, yet their effectiveness depends heavily on the underlying communication topology that coordinates agent interactions. Within these systems, successful problem-solving often necessitates task-specific group structures to divide and conquer subtasks. However, most existing approaches generate communication topologies in a node-centric manner, leaving group structures to emerge implicitly from local connectivity decisions rather than modeling them explicitly, often leading to suboptimal coordination and unnecessary communication overhead. To address this limitation, we propose GoAgent (Group-of-Agents), a communication topology generation method that explicitly treats collaborative groups as the atomic units of MAS construction. Specifically, GoAgent first enumerates task-relevant candidate groups through an LLM and then autoregressively selects and connects these groups as atomic units to construct the final communication graph, jointly capturing intra-group cohesion and inter-group coordination. To mitigate communication redundancy and noise propagation inherent in expanding topologies, we further introduce a conditional information bottleneck (CIB) objective that compresses inter-group communication, preserving task-relevant signals while filtering out redundant historical noise. Extensive experiments on six benchmarks demonstrate the state-of-the-art performance of GoAgent with 93.84% average accuracy while reducing token consumption by about 17%.
LGFeb 1Code
WinFLoRA: Incentivizing Client-Adaptive Aggregation in Federated LoRA under Privacy HeterogeneityMengsha Kou, Xiaoyu Xia, Ziqi Wang et al.
Large Language Models (LLMs) increasingly underpin intelligent web applications, from chatbots to search and recommendation, where efficient specialization is essential. Low-Rank Adaptation (LoRA) enables such adaptation with minimal overhead, while federated LoRA allows web service providers to fine-tune shared models without data sharing. However, in privacy-sensitive deployments, clients inject varying levels of differential privacy (DP) noise, creating privacy heterogeneity that misaligns individual incentives and global performance. In this paper, we propose WinFLoRA, a privacy-heterogeneous federated LoRA that utilizes aggregation weights as incentives with noise awareness. Specifically, the noises from clients are estimated based on the uploaded LoRA adapters. A larger weight indicates greater influence on the global model and better downstream task performance, rewarding lower-noise contributions. By up-weighting low-noise updates, WinFLoRA improves global accuracy while accommodating clients' heterogeneous privacy requirements. Consequently, WinFLoRA aligns heterogeneous client utility in terms of privacy and downstream performance with global model objectives without third-party involvement. Extensive evaluations demonstrate that across multiple LLMs and datasets, WinFLoRA achieves up to 52.58% higher global accuracy and up to 2.56x client utility than state-of-the-art benchmarks. Source code is publicly available at https://github.com/koums24/WinFLoRA.git.
LGOct 14, 2024
Edge Unlearning is Not "on Edge"! An Adaptive Exact Unlearning System on Resource-Constrained DevicesXiaoyu Xia, Ziqi Wang, Ruoxi Sun et al.
The right to be forgotten mandates that machine learning models enable the erasure of a data owner's data and information from a trained model. Removing data from the dataset alone is inadequate, as machine learning models can memorize information from the training data, increasing the potential privacy risk to users. To address this, multiple machine unlearning techniques have been developed and deployed. Among them, approximate unlearning is a popular solution, but recent studies report that its unlearning effectiveness is not fully guaranteed. Another approach, exact unlearning, tackles this issue by discarding the data and retraining the model from scratch, but at the cost of considerable computational and memory resources. However, not all devices have the capability to perform such retraining. In numerous machine learning applications, such as edge devices, Internet-of-Things (IoT), mobile devices, and satellites, resources are constrained, posing challenges for deploying existing exact unlearning methods. In this study, we propose a Constraint-aware Adaptive Exact Unlearning System at the network Edge (CAUSE), an approach to enabling exact unlearning on resource-constrained devices. Aiming to minimize the retrain overhead by storing sub-models on the resource-constrained device, CAUSE innovatively applies a Fibonacci-based replacement strategy and updates the number of shards adaptively in the user-based data partition process. To further improve the effectiveness of memory usage, CAUSE leverages the advantage of model pruning to save memory via compression with minimal accuracy sacrifice. The experimental results demonstrate that CAUSE significantly outperforms other representative systems in realizing exact unlearning on the resource-constrained device by 9.23%-80.86%, 66.21%-83.46%, and 5.26%-194.13% in terms of unlearning speed, energy consumption, and accuracy.
CLMar 19, 2025
Deep Contrastive Unlearning for Language ModelsEstrid He, Tabinda Sarwar, Ibrahim Khalil et al.
The past a few years have witnessed the great success of large language models, demonstrating powerful capabilities in comprehending textual data and generating human-like languages. Large language models achieve success by being trained on vast amounts of textual data, including online sources with copyrighted content and user-generated knowledge. However, this comes at a cost: the potential risk of exposing users' privacy and violating copyright protections. Thus, to safeguard individuals' "right to be forgotten", there has been increasing interests in machine unlearning -- the process of removing information carried by particular training samples from a model while not deteriorating its predictive quality. This is a challenging task due to the black-box nature of language models. Most existing studies focus on mitigating the impact of those forgot samples upon a model's outputs, and do not explicitly consider the geometric distributions of samples in the latent space of a model. To address this issue, we propose a machine unlearning framework, named Deep Contrastive Unlearning for fine-Tuning (DeepCUT) language models. Our proposed model achieves machine unlearning by directly optimizing the latent space of a model. Comprehensive experiments on real-world datasets demonstrate the effectiveness and efficiency of DeepCUT with consistent and significant improvement over baseline methods.
CVAug 7, 2025
From Detection to Correction: Backdoor-Resilient Face Recognition via Vision-Language Trigger Detection and Noise-Based NeutralizationFarah Wahida, M. A. P. Chamikara, Yashothara Shanmugarasa et al.
Biometric systems, such as face recognition systems powered by deep neural networks (DNNs), rely on large and highly sensitive datasets. Backdoor attacks can subvert these systems by manipulating the training process. By inserting a small trigger, such as a sticker, make-up, or patterned mask, into a few training images, an adversary can later present the same trigger during authentication to be falsely recognized as another individual, thereby gaining unauthorized access. Existing defense mechanisms against backdoor attacks still face challenges in precisely identifying and mitigating poisoned images without compromising data utility, which undermines the overall reliability of the system. We propose a novel and generalizable approach, TrueBiometric: Trustworthy Biometrics, which accurately detects poisoned images using a majority voting mechanism leveraging multiple state-of-the-art large vision language models. Once identified, poisoned samples are corrected using targeted and calibrated corrective noise. Our extensive empirical results demonstrate that TrueBiometric detects and corrects poisoned images with 100\% accuracy without compromising accuracy on clean images. Compared to existing state-of-the-art approaches, TrueBiometric offers a more practical, accurate, and effective solution for mitigating backdoor attacks in face recognition systems.
LGJun 24, 2024
Machine Unlearning with Minimal Gradient Dependence for High Unlearning RatiosTao Huang, Ziyang Chen, Jiayang Meng et al.
In the context of machine unlearning, the primary challenge lies in effectively removing traces of private data from trained models while maintaining model performance and security against privacy attacks like membership inference attacks. Traditional gradient-based unlearning methods often rely on extensive historical gradients, which becomes impractical with high unlearning ratios and may reduce the effectiveness of unlearning. Addressing these limitations, we introduce Mini-Unlearning, a novel approach that capitalizes on a critical observation: unlearned parameters correlate with retrained parameters through contraction mapping. Our method, Mini-Unlearning, utilizes a minimal subset of historical gradients and leverages this contraction mapping to facilitate scalable, efficient unlearning. This lightweight, scalable method significantly enhances model accuracy and strengthens resistance to membership inference attacks. Our experiments demonstrate that Mini-Unlearning not only works under higher unlearning ratios but also outperforms existing techniques in both accuracy and security, offering a promising solution for applications requiring robust unlearning capabilities.
CRMay 16, 2023
Privacy-Preserving Ensemble Infused Enhanced Deep Neural Network Framework for Edge Cloud ConvergenceVeronika Stephanie, Ibrahim Khalil, Mohammad Saidur Rahman et al.
We propose a privacy-preserving ensemble infused enhanced Deep Neural Network (DNN) based learning framework in this paper for Internet-of-Things (IoT), edge, and cloud convergence in the context of healthcare. In the convergence, edge server is used for both storing IoT produced bioimage and hosting DNN algorithm for local model training. The cloud is used for ensembling local models. The DNN-based training process of a model with a local dataset suffers from low accuracy, which can be improved by the aforementioned convergence and Ensemble Learning. The ensemble learning allows multiple participants to outsource their local model for producing a generalized final model with high accuracy. Nevertheless, Ensemble Learning elevates the risk of leaking sensitive private data from the final model. The proposed framework presents a Differential Privacy-based privacy-preserving DNN with Transfer Learning for a local model generation to ensure minimal loss and higher efficiency at edge server. We conduct several experiments to evaluate the performance of our proposed framework.
CRMay 16, 2023
Trustworthy Privacy-preserving Hierarchical Ensemble and Federated Learning in Healthcare 4.0 with BlockchainVeronika Stephanie, Ibrahim Khalil, Mohammed Atiquzzaman et al.
The advancement of Internet and Communication Technologies (ICTs) has led to the era of Industry 4.0. This shift is followed by healthcare industries creating the term Healthcare 4.0. In Healthcare 4.0, the use of IoT-enabled medical imaging devices for early disease detection has enabled medical practitioners to increase healthcare institutions' quality of service. However, Healthcare 4.0 is still lagging in Artificial Intelligence and big data compared to other Industry 4.0 due to data privacy concerns. In addition, institutions' diverse storage and computing capabilities restrict institutions from incorporating the same training model structure. This paper presents a secure multi-party computation-based ensemble federated learning with blockchain that enables heterogeneous models to collaboratively learn from healthcare institutions' data without violating users' privacy. Blockchain properties also allow the party to enjoy data integrity without trust in a centralized server while also providing each healthcare institution with auditability and version control capability.
CRMay 16, 2023
Smart Policy Control for Securing Federated Learning Management SystemAditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammed Atiquzzaman
The widespread adoption of Internet of Things (IoT) devices in smart cities, intelligent healthcare systems, and various real-world applications have resulted in the generation of vast amounts of data, often analyzed using different Machine Learning (ML) models. Federated learning (FL) has been acknowledged as a privacy-preserving machine learning technology, where multiple parties cooperatively train ML models without exchanging raw data. However, the current FL architecture does not allow for an audit of the training process due to the various data-protection policies implemented by each FL participant. Furthermore, there is no global model verifiability available in the current architecture. This paper proposes a smart contract-based policy control for securing the Federated Learning (FL) management system. First, we develop and deploy a smart contract-based local training policy control on the FL participants' side. This policy control is used to verify the training process, ensuring that the evaluation process follows the same rules for all FL participants. We then enforce a smart contract-based aggregation policy to manage the global model aggregation process. Upon completion, the aggregated model and policy are stored on blockchain-based storage. Subsequently, we distribute the aggregated global model and the smart contract to all FL participants. Our proposed method uses smart policy control to manage access and verify the integrity of machine learning models. We conducted multiple experiments with various machine learning architectures and datasets to evaluate our proposed framework, such as MNIST and CIFAR-10.
CRFeb 12, 2022
Local Differential Privacy for Federated LearningM. A. P. Chamikara, Dongxi Liu, Seyit Camtepe et al.
Advanced adversarial attacks such as membership inference and model memorization can make federated learning (FL) vulnerable and potentially leak sensitive private data. Local differentially private (LDP) approaches are gaining more popularity due to stronger privacy notions and native support for data distribution compared to other differentially private (DP) solutions. However, DP approaches assume that the FL server (that aggregates the models) is honest (run the FL protocol honestly) or semi-honest (run the FL protocol honestly while also trying to learn as much information as possible). These assumptions make such approaches unrealistic and unreliable for real-world settings. Besides, in real-world industrial environments (e.g., healthcare), the distributed entities (e.g., hospitals) are already composed of locally running machine learning models (this setting is also referred to as the cross-silo setting). Existing approaches do not provide a scalable mechanism for privacy-preserving FL to be utilized under such settings, potentially with untrusted parties. This paper proposes a new local differentially private FL (named LDPFL) protocol for industrial settings. LDPFL can run in industrial settings with untrusted entities while enforcing stronger privacy guarantees than existing approaches. LDPFL shows high FL model performance (up to 98%) under small privacy budgets (e.g., epsilon = 0.5) in comparison to existing methods.
CRNov 22, 2021
Chaos and Logistic Map based Key Generation Technique for AES-driven IoT SecurityZiaur Rahman, Xun Yi, Ibrahim Khalil et al.
Several efforts have been seen claiming the lightweight block ciphers as a necessarily suitable substitute in securing the Internet of Things. Currently, it has been able to envisage as a pervasive frame of reference almost all across the privacy preserving of smart and sensor-oriented appliances. Different approaches are likely to be inefficient, bringing desired degree of security considering the easiness and surely the process of simplicity but security. Strengthening the well-known symmetric key and block dependent algorithm using either chaos motivated logistic map or elliptic curve has shown a far-reaching potential to be a discretion in secure real-time communication. The popular feature of logistic maps, such as the un-foreseeability and randomness often expected to be used in dynamic key-propagation in sync with chaos and scheduling technique towards data integrity. As a bit alternation in keys, able to come up with oversize deviation, also would have consequence to leverage data confidentiality. Henceforth it may have proximity to time consumption, which may lead to a challenge to make sure instant data exchange between participating node entities. In consideration of delay latency required to both secure encryption and decryption, the proposed approach suggests a modification on the key-origination matrix along with S-box. It has plausibly been taken us to this point that the time required proportionate to the plain-text sent while the plain-text disproportionate to the probability happening a letter on the message made. In line with that the effort so far sought how apparent chaos escalates the desired key-initiation before message transmission.
CRNov 22, 2021
Blockchain for IoT: A Critical Analysis Concerning Performance and ScalabilityZiaur Rahman, Xun Yi, Ibrahim Khalil et al.
The world has been experiencing a mind-blowing expansion of blockchain technology since it was first introduced as an emerging means of cryptocurrency called bitcoin. Currently, it has been regarded as a pervasive frame of reference across almost all research domains, ranging from virtual cash to agriculture or even supply-chain to the Internet of Things. The ability to have a self-administering register with legitimate immutability makes blockchain appealing for the Internet of Things (IoT). As billions of IoT devices are now online in distributed fashion, the huge challenges and questions require to addressed in pursuit of urgently needed solutions. The present paper has been motivated by the aim of facilitating such efforts. The contribution of this work is to figure out those trade-offs the IoT ecosystem usually encounters because of the wrong choice of blockchain technology. Unlike a survey or review, the critical findings of this paper target sorting out specific security challenges of blockchain-IoT Infrastructure. The contribution includes how to direct developers and researchers in this domain to pick out the unblemished combinations of Blockchain enabled IoT applications. In addition, the paper promises to bring a deep insight on Ethereum, Hyperledger blockchain and IOTA technology to show their limitations and prospects in terms of performance and scalability.
CRJun 24, 2021
Blockchain-based Security Framework for Critical Industry 4.0 Cyber-physical SystemZiaur Rahman, Ibrahim Khalil, Xun Yi et al.
There has been an intense concern for security alternatives because of the recent rise of cyber attacks, mainly targeting critical systems such as industry, medical, or energy ecosystem. Though the latest industry infrastructures largely depend on AI-driven maintenance, the prediction based on corrupted data undoubtedly results in loss of life and capital. Admittedly, an inadequate data-protection mechanism can readily challenge the security and reliability of the network. The shortcomings of the conventional cloud or trusted certificate-driven techniques have motivated us to exhibit a unique Blockchain-based framework for a secure and efficient industry 4.0 system. The demonstrated framework obviates the long-established certificate authority after enhancing the consortium Blockchain that reduces the data processing delay, and increases cost-effective throughput. Nonetheless, the distributed industry 4.0 security model entails cooperative trust than depending on a single party, which in essence indulges the costs and threat of the single point of failure. Therefore, multi-signature technique of the proposed framework accomplishes the multi-party authentication, which confirms its applicability for the real-time and collaborative cyber-physical system.
CRMar 8, 2021
Social Media Identity Deception Detection: A SurveyAhmed Alharbi, Hai Dong, Xun Yi et al.
Social media have been growing rapidly and become essential elements of many people's lives. Meanwhile, social media have also come to be a popular source for identity deception. Many social media identity deception cases have arisen over the past few years. Recent studies have been conducted to prevent and detect identity deception. This survey analyses various identity deception attacks, which can be categorized into fake profile, identity theft and identity cloning. This survey provides a detailed review of social media identity deception detection techniques. It also identifies primary research challenges and issues in the existing detection techniques. This article is expected to benefit both researchers and social media providers.
CRJul 4, 2020
PPaaS: Privacy Preservation as a ServicePathum Chamikara Mahawaga Arachchige, Peter Bertok, Ibrahim Khalil et al.
Personally identifiable information (PII) can find its way into cyberspace through various channels, and many potential sources can leak such information. Data sharing (e.g. cross-agency data sharing) for machine learning and analytics is one of the important components in data science. However, due to privacy concerns, data should be enforced with strong privacy guarantees before sharing. Different privacy-preserving approaches were developed for privacy preserving data sharing; however, identifying the best privacy-preservation approach for the privacy-preservation of a certain dataset is still a challenge. Different parameters can influence the efficacy of the process, such as the characteristics of the input dataset, the strength of the privacy-preservation approach, and the expected level of utility of the resulting dataset (on the corresponding data mining application such as classification). This paper presents a framework named \underline{P}rivacy \underline{P}reservation \underline{a}s \underline{a} \underline{S}ervice (PPaaS) to reduce this complexity. The proposed method employs selective privacy preservation via data perturbation and looks at different dynamics that can influence the quality of the privacy preservation of a dataset. PPaaS includes pools of data perturbation methods, and for each application and the input dataset, PPaaS selects the most suitable data perturbation approach after rigorous evaluation. It enhances the usability of privacy-preserving methods within its pool; it is a generic platform that can be used to sanitize big data in a granular, application-specific manner by employing a suitable combination of diverse privacy-preserving algorithms to provide a proper balance between privacy and utility.
CRNov 1, 2019
IoTSign: Protecting Privacy and Authenticity of IoT using Discrete Cosine Based SteganographySharif Abuadbba, Ayman Ibaida, Ibrahim Khalil
Remotely generated data by Intent of Things (IoT) has recently had a lot of attention for their huge benefits such as efficient monitoring and risk reduction. The transmitted streams usually consist of periodical streams (e.g. activities) and highly private information (e.g. IDs). Despite the obvious benefits, the concerns are the secrecy and the originality of the transferred data. Surprisingly, although these concerns have been well studied for static data, they have received only limited attention for streaming data. Therefore, this paper introduces a new steganographic mechanism that provides (1) robust privacy protection of secret information by concealing them arbitrarily in the transported readings employing a random key, and (2) permanent proof of originality for the normal streams. This model surpasses our previous works by employing the Discrete Cosine Transform to expand the hiding capacity and reduce complexity. The resultant distortion has been accurately measured at all stages - the original, the stego, and the recovered forms - using a well-known measurement matrix called Percentage Residual Difference (PRD). After thorough experiments on three types of streams (i.e. chemical, environmental and smart homes), it has been proven that the original streams have not been affected (< 1 %). Also, the mathematical analysis shows that the model has much lighter (i.e. linear) computational complexity O(n) compared to existing work.