CRMar 7, 2023
A Survey on Explainable Artificial Intelligence for CybersecurityGaith Rjoub, Jamal Bentahar, Omar Abdel Wahab et al.
The black-box nature of artificial intelligence (AI) models has been the source of many concerns in their use for critical applications. Explainable Artificial Intelligence (XAI) is a rapidly growing research field that aims to create machine learning models that can provide clear and interpretable explanations for their decisions and actions. In the field of network cybersecurity, XAI has the potential to revolutionize the way we approach network security by enabling us to better understand the behavior of cyber threats and to design more effective defenses. In this survey, we review the state of the art in XAI for cybersecurity in network systems and explore the various approaches that have been proposed to address this important problem. The review follows a systematic classification of network-driven cybersecurity threats and issues. We discuss the challenges and limitations of current XAI methods in the context of cybersecurity and outline promising directions for future research.
CYApr 18, 2023
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future DirectionsHani Sami, Ahmad Hammoud, Mouhamad Arafeh et al.
The Metaverse offers a second world beyond reality, where boundaries are non-existent, and possibilities are endless through engagement and immersive experiences using the virtual reality (VR) technology. Many disciplines can benefit from the advancement of the Metaverse when accurately developed, including the fields of technology, gaming, education, art, and culture. Nevertheless, developing the Metaverse environment to its full potential is an ambiguous task that needs proper guidance and directions. Existing surveys on the Metaverse focus only on a specific aspect and discipline of the Metaverse and lack a holistic view of the entire process. To this end, a more holistic, multi-disciplinary, in-depth, and academic and industry-oriented review is required to provide a thorough study of the Metaverse development pipeline. To address these issues, we present in this survey a novel multi-layered pipeline ecosystem composed of (1) the Metaverse computing, networking, communications and hardware infrastructure, (2) environment digitization, and (3) user interactions. For every layer, we discuss the components that detail the steps of its development. Also, for each of these components, we examine the impact of a set of enabling technologies and empowering domains (e.g., Artificial Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on its advancement. In addition, we explain the importance of these technologies to support decentralization, interoperability, user experiences, interactions, and monetization. Our presented study highlights the existing challenges for each component, followed by research directions and potential solutions. To the best of our knowledge, this survey is the most comprehensive and allows users, scholars, and entrepreneurs to get an in-depth understanding of the Metaverse ecosystem to find their opportunities and potentials for contribution.
LGOct 31, 2022
FedMint: Intelligent Bilateral Client Selection in Federated Learning with Newcomer IoT DevicesOsama Wehbi, Sarhad Arisdakessian, Omar Abdel Wahab et al.
Federated Learning (FL) is a novel distributed privacy-preserving learning paradigm, which enables the collaboration among several participants (e.g., Internet of Things devices) for the training of machine learning models. However, selecting the participants that would contribute to this collaborative training is highly challenging. Adopting a random selection strategy would entail substantial problems due to the heterogeneity in terms of data quality, and computational and communication resources across the participants. Although several approaches have been proposed in the literature to overcome the problem of random selection, most of these approaches follow a unilateral selection strategy. In fact, they base their selection strategy on only the federated server's side, while overlooking the interests of the client devices in the process. To overcome this problem, we present in this paper FedMint, an intelligent client selection approach for federated learning on IoT devices using game theory and bootstrapping mechanism. Our solution involves the design of: (1) preference functions for the client IoT devices and federated servers to allow them to rank each other according to several factors such as accuracy and price, (2) intelligent matching algorithms that take into account the preferences of both parties in their design, and (3) bootstrapping technique that capitalizes on the collaboration of multiple federated servers in order to assign initial accuracy value for the newly connected IoT devices. Based on our simulation findings, our strategy surpasses the VanillaFL selection approach in terms of maximizing both the revenues of the client devices and accuracy of the global federated learning model.
54.6LGMar 30
FL-PBM: Pre-Training Backdoor Mitigation for Federated LearningOsama Wehbi, Sarhad Arisdakessian, Omar Abdel Wahab et al.
Backdoor attacks pose a significant threat to the integrity and reliability of Artificial Intelligence (AI) models, enabling adversaries to manipulate model behavior by injecting poisoned data with hidden triggers. These attacks can lead to severe consequences, especially in critical applications such as autonomous driving, healthcare, and finance. Detecting and mitigating backdoor attacks is crucial across the lifespan of model's phases, including pre-training, in-training, and post-training. In this paper, we propose Pre-Training Backdoor Mitigation for Federated Learning (FL-PBM), a novel defense mechanism that proactively filters poisoned data on the client side before model training in a federated learning (FL) environment. The approach consists of three stages: (1) inserting a benign trigger into the data to establish a controlled baseline, (2) applying Principal Component Analysis (PCA) to extract discriminative features and assess the separability of the data, (3) performing Gaussian Mixture Model (GMM) clustering to identify potentially malicious data samples based on their distribution in the PCA-transformed space, and (4) applying a targeted blurring technique to disrupt potential backdoor triggers. Together, these steps ensure that suspicious data is detected early and sanitized effectively, thereby minimizing the influence of backdoor triggers on the global model. Experimental evaluations on image-based datasets demonstrate that FL-PBM reduces attack success rates by up to 95% compared to baseline federated learning (FedAvg) and by 30 to 80% relative to state-of-the-art defenses (RDFL and LPSF). At the same time, it maintains over 90% clean model accuracy in most experiments, achieving better mitigation without degrading model performance.
17.5LGMar 30
Mitigating Backdoor Attacks in Federated Learning Using PPA and MiniMax Game TheoryOsama Wehbi, Sarhad Arisdakessian, Omar Abdel Wahab et al.
Federated Learning (FL) is witnessing wider adoption due to its ability to benefit from large amounts of scattered data while preserving privacy. However, despite its advantages, federated learning suffers from several setbacks that directly impact the accuracy, and the integrity of the global model it produces. One of these setbacks is the presence of malicious clients who actively try to harm the global model by injecting backdoor data into their local models while trying to evade detection. The objective of such clients is to trick the global model into making false predictions during inference, thereby compromising the integrity and trustworthiness of the global model on which honest stakeholders rely. To mitigate such mischievous behavior, we propose FedBBA (Federated Backdoor and Behavior Analysis). The proposed model aims to dampen the effect of such clients on the final accuracy, creating more resilient federated learning environments. We engineer our approach through the combination of (1) a reputation system to evaluate and track client behavior, (2) an incentive mechanism to reward honest participation and penalize malicious behavior, and (3) game theoretical models with projection pursuit analysis (PPA) to dynamically identify and minimize the impact of malicious clients on the global model. Extensive simulations on the German Traffic Sign Recognition Benchmark (GTSRB) and Belgium Traffic Sign Classification (BTSC) datasets demonstrate that FedBBA reduces the backdoor attack success rate to approximately 1.1%--11% across various attack scenarios, significantly outperforming state-of-the-art defenses like RDFL and RoPE, which yielded attack success rates between 23% and 76%, while maintaining high normal task accuracy (~95%--98%).
GTMay 1, 2024
Enhancing Mutual Trustworthiness in Federated Learning for Data-Rich Smart CitiesOsama Wehbi, Sarhad Arisdakessian, Mohsen Guizani et al.
Federated learning is a promising collaborative and privacy-preserving machine learning approach in data-rich smart cities. Nevertheless, the inherent heterogeneity of these urban environments presents a significant challenge in selecting trustworthy clients for collaborative model training. The usage of traditional approaches, such as the random client selection technique, poses several threats to the system's integrity due to the possibility of malicious client selection. Primarily, the existing literature focuses on assessing the trustworthiness of clients, neglecting the crucial aspect of trust in federated servers. To bridge this gap, in this work, we propose a novel framework that addresses the mutual trustworthiness in federated learning by considering the trust needs of both the client and the server. Our approach entails: (1) Creating preference functions for servers and clients, allowing them to rank each other based on trust scores, (2) Establishing a reputation-based recommendation system leveraging multiple clients to assess newly connected servers, (3) Assigning credibility scores to recommending devices for better server trustworthiness measurement, (4) Developing a trust assessment mechanism for smart devices using a statistical Interquartile Range (IQR) method, (5) Designing intelligent matching algorithms considering the preferences of both parties. Based on simulation and experimental results, our approach outperforms baseline methods by increasing trust levels, global model accuracy, and reducing non-trustworthy clients in the system.
GTApr 26, 2021
Cloud computing as a platform for monetizing data services: A two-sided game business modelAhmed Saleh Bataineh, Jamal Bentahar, Rabeb Mizouni et al.
With the unprecedented reliance on cloud computing as the backbone for storing today's big data, we argue in this paper that the role of the cloud should be reshaped from being a passive virtual market to become an active platform for monetizing the big data through Artificial Intelligence (AI) services. The objective is to enable the cloud to be an active platform that can help big data service providers reach a wider set of customers and cloud users (i.e., data consumers) to be exposed to a larger and richer variety of data to run their data analytic tasks. To achieve this vision, we propose a novel game theoretical model, which consists of a mix of cooperative and competitive strategies. The players of the game are the big data service providers, cloud computing platform, and cloud users. The strategies of the players are modeled using the two-sided market theory that takes into consideration the network effects among involved parties, while integrating the externalities between the cloud resources and consumer demands into the design of the game. Simulations conducted using Amazon and google clustered data show that the proposed model improves the total surplus of all the involved parties in terms of cloud resources provision and monetary profits compared to the current merchant model.
IRJun 9, 2020
A two-level solution to fight against dishonest opinions in recommendation-based trust systemsOmar Abdel Wahab, Jamal Bentahar, Robin Cohen et al.
In this paper, we propose a mechanism to deal with dishonest opinions in recommendation-based trust models, at both the collection and processing levels. We consider a scenario in which an agent requests recommendations from multiple parties to build trust toward another agent. At the collection level, we propose to allow agents to self-assess the accuracy of their recommendations and autonomously decide on whether they would participate in the recommendation process or not. At the processing level, we propose a recommendations aggregation technique that is resilient to collusion attacks, followed by a credibility update mechanism for the participating agents. The originality of our work stems from its consideration of dishonest opinions at both the collection and processing levels, which allows for better and more persistent protection against dishonest recommenders. Experiments conducted on the Epinions dataset show that our solution yields better performance in protecting the recommendation process against Sybil attacks, in comparison with a competing model that derives the optimal network of advisors based on the agents' trust values.
LGMay 23, 2019
Generative Adversarial Networks for Mitigating Biases in Machine Learning SystemsAdel Abusitta, Esma Aïmeur, Omar Abdel Wahab
In this paper, we propose a new framework for mitigating biases in machine learning systems. The problem of the existing mitigation approaches is that they are model-oriented in the sense that they focus on tuning the training algorithms to produce fair results, while overlooking the fact that the training data can itself be the main reason for biased outcomes. Technically speaking, two essential limitations can be found in such model-based approaches: 1) the mitigation cannot be achieved without degrading the accuracy of the machine learning models, and 2) when the data used for training are largely biased, the training time automatically increases so as to find suitable learning parameters that help produce fair results. To address these shortcomings, we propose in this work a new framework that can largely mitigate the biases and discriminations in machine learning systems while at the same time enhancing the prediction accuracy of these systems. The proposed framework is based on conditional Generative Adversarial Networks (cGANs), which are used to generate new synthetic fair data with selective properties from the original data. We also propose a framework for analyzing data biases, which is important for understanding the amount and type of data that need to be synthetically sampled and labeled for each population group. Experimental results show that the proposed solution can efficiently mitigate different types of biases, while at the same time enhancing the prediction accuracy of the underlying machine learning model.