Fehmi Jaafar

DC
h-index: 8
Papers: 8
Citations: 39
Novelty: 39%
AI Score: 31

8 Papers

DC · Sep 25, 2023
SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture

Amine Barrak, Mayssa Jaziri, Ranim Trabelsi et al.

The advent of serverless computing has ushered in notable advancements in distributed machine learning, particularly within parameter server-based architectures. Yet, the integration of serverless features within peer-to-peer (P2P) distributed networks remains largely uncharted. In this paper, we introduce SPIRT, a fault-tolerant, reliable, and secure serverless P2P ML training architecture designed to bridge this gap. Capitalizing on the robustness and reliability inherent to P2P systems, SPIRT employs RedisAI for in-database operations, leading to an 82% reduction in the time required for model updates and gradient averaging across a variety of models and batch sizes. This architecture showcases resilience against peer failures and adeptly manages the integration of new peers, thereby highlighting its fault-tolerant characteristics and scalability. Furthermore, SPIRT ensures secure communication between peers, enhancing the reliability of distributed machine learning tasks. Even in the face of Byzantine attacks, the system's robust aggregation algorithms maintain high levels of accuracy. These findings illuminate the promising potential of serverless architectures in P2P distributed machine learning, offering a significant stride towards the development of more efficient, scalable, and resilient applications.
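As a rough illustration of why robust aggregation matters under Byzantine peers, the sketch below contrasts plain gradient averaging with a coordinate-wise median. The abstract does not name the aggregation rules SPIRT uses, so the median here is an assumed stand-in, and the function names and gradient values are hypothetical.

```python
from statistics import mean, median


def average_gradients(peer_gradients):
    """Coordinate-wise mean of the gradient vectors sent by all peers."""
    return [mean(coords) for coords in zip(*peer_gradients)]


def median_gradients(peer_gradients):
    """Coordinate-wise median: one simple robust aggregation rule
    (an assumption for illustration, not necessarily what SPIRT uses)."""
    return [median(coords) for coords in zip(*peer_gradients)]


# Three honest peers plus one Byzantine peer sending an extreme gradient.
honest = [[0.10, -0.20], [0.12, -0.18], [0.08, -0.22]]
byzantine = [[100.0, 100.0]]
gradients = honest + byzantine

avg = average_gradients(gradients)  # dragged far away by the outlier
med = median_gradients(gradients)   # stays close to the honest gradients

print("mean  :", avg)
print("median:", med)
```

With a single outlier among four peers, the mean is pulled far from the honest values while the median stays within their range.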

DC · Sep 25, 2023
Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning

Amine Barrak, Ranim Trabelsi, Fehmi Jaafar et al.

The increasing demand for computational power in big data and machine learning has driven the development of distributed training methodologies. Among these, peer-to-peer (P2P) networks provide advantages such as enhanced scalability and fault tolerance. However, they also encounter challenges related to resource consumption, costs, and communication overhead as the number of participating peers grows. In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints. Our findings show a significant enhancement in gradient computation time, with up to a 97.34% improvement compared to conventional P2P distributed training methods. As for costs, our examination confirmed that the serverless architecture could incur higher expenses, reaching up to 5.4 times more than instance-based architectures. It is essential to consider that these higher costs are associated with marked improvements in computation time, particularly under resource-constrained scenarios. Despite the cost-time trade-off, the serverless approach still holds promise due to its pay-as-you-go model. Utilizing dynamic resource allocation, it enables faster training times and optimized resource utilization, making it a promising candidate for a wide range of machine learning applications.
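To make the cost-time trade-off concrete, the sketch below converts the two headline figures into a speedup and a speedup-per-cost ratio. It assumes the 97.34% improvement means a 97.34% reduction in gradient computation time, and that both "up to" figures apply to the same configuration, which the abstract does not state. The function name is hypothetical.

```python
def speedup_from_reduction(reduction_pct):
    """Speedup factor implied by a percentage reduction in run time."""
    return 1.0 / (1.0 - reduction_pct / 100.0)


# Headline figures from the abstract; pairing them is an assumption.
speedup = speedup_from_reduction(97.34)   # time shrinks to 2.66% of baseline
cost_ratio = 5.4                          # serverless cost vs. instance-based

speedup_per_cost = speedup / cost_ratio

print(f"speedup: {speedup:.1f}x, cost: {cost_ratio}x, "
      f"speedup per unit of extra cost: {speedup_per_cost:.1f}")
```

Under these assumptions the best case is roughly a 37x faster gradient computation for 5.4x the cost.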

DC · Feb 27, 2023
Architecting Peer-to-Peer Serverless Distributed Machine Learning Training for Improved Fault Tolerance

Amine Barrak, Fabio Petrillo, Fehmi Jaafar

Distributed Machine Learning refers to the practice of training a model on multiple computers or devices that can be called nodes. Additionally, serverless computing is a new paradigm for cloud computing that uses functions as a computational unit. Serverless computing can be effective for distributed learning systems by enabling automated resource scaling, less manual intervention, and cost reduction. By distributing the workload, distributed machine learning can speed up the training process and allow more complex models to be trained. Several topologies of distributed machine learning have been established (centralized, parameter server, peer-to-peer). However, the parameter server architecture may have limitations in terms of fault tolerance, including a single point of failure and complex recovery processes. Moreover, training machine learning in a peer-to-peer (P2P) architecture can offer benefits in terms of fault tolerance by eliminating the single point of failure. In a P2P architecture, each node or worker can act as both a server and a client, which allows for more decentralized decision making and eliminates the need for a central coordinator. In this position paper, we propose exploring the use of serverless computing in distributed machine learning training and comparing the performance of P2P architecture with the parameter server architecture, focusing on cost reduction and fault tolerance.

CR · Apr 19, 2025
A Data-Centric Approach for Safe and Secure Large Language Models against Threatening and Toxic Content

Chaima Njeh, Haïfa Nakouri, Fehmi Jaafar

Large Language Models (LLMs) have made remarkable progress, but concerns about potential biases and harmful content persist. To address these concerns, we introduce a practical solution for ensuring the safe and ethical use of LLMs. Our novel approach focuses on a post-generation correction mechanism, the BART-Corrective Model, which adjusts generated content to ensure safety and security. Rather than relying solely on model fine-tuning or prompt engineering, our method provides a robust data-centric alternative for mitigating harmful content. We demonstrate the effectiveness of our approach through experiments on multiple toxic datasets, which show a significant reduction in mean toxicity and jail-breaking scores after integration. Specifically, our results show a reduction of 15% and 21% in mean toxicity and jail-breaking scores with GPT-4, a substantial reduction of 28% and 5% with PaLM2, a reduction of approximately 26% and 23% with Mistral-7B, and a reduction of 11.1% and 19% with Gemma-2b-it. These results demonstrate the potential of our approach to improve the safety and security of LLMs, making them more suitable for real-world applications.

AI · Oct 14, 2025
Towards Robust Artificial Intelligence: Self-Supervised Learning Approach for Out-of-Distribution Detection

Wissam Salhab, Darine Ameyed, Hamid Mcheick et al.

Robustness in AI systems refers to their ability to maintain reliable and accurate performance under various conditions, including out-of-distribution (OOD) samples, adversarial attacks, and environmental changes. This is crucial in safety-critical systems, such as autonomous vehicles, transportation, or healthcare, where malfunctions could have severe consequences. This paper proposes an approach to improve OOD detection without the need for labeled data, thereby increasing the robustness of AI systems. The proposed approach leverages the principles of self-supervised learning, allowing the model to learn useful representations from unlabeled data. Combined with graph-theoretical techniques, this enables more efficient identification and categorization of OOD samples. Compared to existing state-of-the-art methods, this approach achieved an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.99.
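For readers unfamiliar with the metric, the sketch below computes AUROC for an OOD detector as the probability that a randomly chosen OOD sample receives a higher score than a randomly chosen in-distribution sample, with ties counted as one half. The scores are made-up values and the function name is hypothetical. It shows the metric only, not the paper's detection method.

```python
def auroc(scores_in, scores_out):
    """AUROC via pairwise comparison: fraction of (in, out) pairs in which
    the OOD sample scores higher, counting ties as 0.5."""
    wins = 0.0
    for out_score in scores_out:
        for in_score in scores_in:
            if out_score > in_score:
                wins += 1.0
            elif out_score == in_score:
                wins += 0.5
    return wins / (len(scores_in) * len(scores_out))


# Hypothetical OOD scores: higher means "more likely out-of-distribution".
in_distribution = [0.1, 0.2, 0.4]
out_of_distribution = [0.3, 0.8, 0.9]

print(auroc(in_distribution, out_of_distribution))
```

An AUROC of 1.0 means the two groups are perfectly separated by score, and 0.5 corresponds to chance.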

SE · Mar 31, 2021
Investigating Design Anti-pattern and Design Pattern Mutations and Their Change- and Fault-proneness

Zeinab Kermansaravi, Md Saidur Rahman et al.

During software evolution, inexperienced developers may introduce design anti-patterns when they modify their software systems to fix bugs or to add new functionalities based on changes in requirements. Developers may also use design patterns to promote software quality or as a possible cure for some design anti-patterns. Thus, design patterns and design anti-patterns are introduced, removed, and mutated from one another by developers. Many studies investigated the evolution of design patterns and design anti-patterns and their impact on software development. However, they investigated design patterns or design anti-patterns in isolation and did not consider their mutations and the impact of these mutations on software quality. Therefore, we report our study of bidirectional mutations between design patterns and design anti-patterns and the impacts of these mutations on software change- and fault-proneness. We analyzed snapshots of seven Java software systems with diverse sizes, evolution histories, and application domains. We built Markov models to capture the probability of occurrences of the different design patterns and design anti-patterns mutations. Results from our study show that (1) design patterns and design anti-patterns mutate into other design patterns and/or design anti-patterns. They also show that (2) some change types primarily trigger mutations of design patterns and design anti-patterns (renaming and changes to comments, declarations, and operators), and (3) some mutations of design anti-patterns and design patterns are more faulty in specific contexts. These results provide important insights into the evolution of design patterns and design anti-patterns and its impact on the change- and fault-proneness of software systems.
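A first-order Markov model of such mutations can be estimated by counting state transitions between consecutive snapshots. The sketch below is a minimal illustration under that assumption. The two state labels and the class histories are hypothetical and are not taken from the study's data.

```python
from collections import Counter, defaultdict


def transition_probabilities(histories):
    """Estimate first-order Markov transition probabilities from
    per-class sequences of states observed across snapshots."""
    counts = defaultdict(Counter)
    for history in histories:
        for src, dst in zip(history, history[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical histories: the state of one class in three snapshots.
histories = [
    ["AntiPattern", "AntiPattern", "Pattern"],
    ["Pattern", "Pattern", "AntiPattern"],
    ["AntiPattern", "Pattern", "Pattern"],
]

probs = transition_probabilities(histories)
for src, row in probs.items():
    print(src, row)
```

Each row of the result sums to 1 and gives the estimated probability that a class in one state is found in each state at the next snapshot.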

SE · Apr 27, 2020
Internet of Things Architectures: A Comparative Study

Marcela G. dos Santos, Darine Ameyed, Fabio Petrillo et al.

Over the past two decades, the Internet of Things (IoT) has become an underlying concept for so many solutions and technologies that it is now hardly possible to enumerate and describe all of them. The concept behind the Internet of Things is as powerful as it is complex, and for the components in an IoT solution to mesh together perfectly, they all have to be part of a well-thought-out structure. That is where understanding the IoT architecture becomes paramount. Because of the vast domain of IoT, there is no single consensus on IoT architecture. Different researchers and organizations have proposed different architectures under a variety of classifications, mainly conceptual, standard, and industrial or commercial adoption. A systematic analysis of IoT architecture is indispensable for comparing the industrial proposals and identifying their similarities and differences. In this work, we summarize information about seven industrial IoT architectures in order to propose an approach that enables a comparative analysis of different IoT architectures. This work presents two main contributions: (i) an approach for analyzing and comparing IoT architectures using Layer-Model; (ii) a comparative study of seven industrial IoT architectures.

CR · Dec 5, 2017
A Slow Read attack Using Cloud

Darine Ameyed, Fehmi Jaafar, Jaouhar Fattahi

Cloud computing relies on sharing computing resources rather than having local servers or personal devices handle applications. Cloud computing has become one of the fastest-growing fields in information technology. However, several new security issues have emerged due to its service delivery models. In this paper, we discuss the case of a distributed denial-of-service (DDoS) attack using cloud resources. First, we show how such an attack using a cloud platform could not be detected by previous techniques. Then we present a tricky solution that is also based on the cloud.