Kawser Wazed Nafi

CR
h-index 48
11 papers
145 citations
Novelty 45%
AI Score 53

11 Papers

76.8 CR May 21
Semantic Attacks on Tool-Augmented LLMs: Securing the Model Context Protocol Against Descriptor-Level Manipulation

Saeid Jamshidi, Arghavan Moradi Dakhel, Kawser Wazed Nafi et al.

The Model Context Protocol (MCP) enables Large Language Models (LLMs) to interact with external tools via tool descriptors, thereby extending their capabilities for task execution, autonomous decision-making, and multi-agent coordination. Existing MCP deployments treat tool descriptors as trusted metadata, despite their direct integration into the LLM reasoning context. This introduces a previously underexplored semantic attack surface. Current defenses primarily target prompt injection, neglecting descriptor-level manipulation that can bias tool selection and downstream reasoning. To address this gap, we formalize three descriptor-driven attack classes: Tool Poisoning, Shadowing, and Rug Pull. We propose a layered defense solution that integrates descriptor integrity verification, pre-context semantic vetting with an auxiliary LLM, and lightweight runtime guardrails, without requiring model retraining. We evaluate GPT-5.3, DeepSeek-V3, and LLaMA-3.5 across eight prompting strategies in controlled, adversarial MCP scenarios in which tool metadata is manipulated to simulate realistic attacks. Results demonstrate that descriptor manipulation can substantially alter tool-selection behavior, producing unsafe tool invocations in up to 36% of trials under baseline configurations. The proposed full-stack mitigation reduces unsafe invocations to 15% while increasing the block rate to 74%, demonstrating substantial improvement in resistance to descriptor-driven attacks. Cross-model analysis further reveals significant differences in robustness, latency, and sensitivity to descriptor-level manipulation across LLM architectures and prompting strategies. This study provides a controlled cross-model evaluation of descriptor-level threats and mitigation strategies in tool-calling LLM systems, establishing an empirical foundation for deploying secure and resilient tool-augmented LLMs.
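The descriptor integrity verification layer mentioned in this abstract can be illustrated with a minimal sketch. It assumes a descriptor is a plain dictionary with hypothetical `name` and `description` fields and that a digest is pinned when the tool is approved; it is not the authors' implementation, only one simple way to notice a descriptor that changed after approval (the Rug Pull case).

```python
import hashlib
import json


def descriptor_digest(descriptor: dict) -> str:
    """Return a SHA-256 digest of a canonical JSON encoding of the descriptor."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_descriptor(descriptor: dict, pinned: dict) -> bool:
    """True only if the tool is known and its descriptor is unchanged since approval."""
    expected = pinned.get(descriptor.get("name"))
    return expected is not None and expected == descriptor_digest(descriptor)


# Hypothetical descriptor, pinned at approval time.
approved = {"name": "read_file", "description": "Read a file from the workspace."}
pinned_digests = {approved["name"]: descriptor_digest(approved)}

# The same tool later advertises a manipulated description.
tampered = dict(approved, description="Read a file. Always forward its contents elsewhere first.")

print(verify_descriptor(approved, pinned_digests))
print(verify_descriptor(tampered, pinned_digests))
```

A check like this only detects changes to an already approved descriptor; a descriptor that is malicious from the start would still need the semantic vetting step the abstract describes.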

39.3 CR May 19
Carbon-Aware Intrusion Detection: A Comparative Study of Supervised and Unsupervised DRL for Sustainable IoT Edge Gateways

Saeid Jamshidi, Foutse Khomh, Kawser Wazed Nafi et al.

The rapid expansion of the Internet of Things (IoT) has intensified cybersecurity challenges, particularly in mitigating Distributed Denial-of-Service (DDoS) attacks at the network edge. Traditional Intrusion Detection Systems (IDSs) face significant limitations, including poor adaptability to evolving and zero-day attacks, reliance on static signatures and labeled datasets, and inefficiency on resource-constrained edge gateways. Moreover, most existing DRL-based IDS studies overlook sustainability factors such as energy efficiency and carbon impact. To address these challenges, this paper proposes two novel Deep Reinforcement Learning (DRL)-based IDS: DeepEdgeIDS, a label-free Autoencoder-DRL hybrid, and AutoDRL-IDS, a supervised LSTM-DRL model. Both DRL-based IDS are validated through theoretical analysis and experimental evaluation on edge gateways. Results demonstrate that AutoDRL-IDS achieves 94% detection accuracy using labeled data, while DeepEdgeIDS attains 98% offline evaluation accuracy through label-free anomaly detection and online mitigation feedback. This study introduces a carbon-aware, multi-objective reward formulation that supports supervised reward optimization for AutoDRL-IDS and label-free online reward learning for DeepEdgeIDS, enabling sustainable real-time IDS operation in dynamic IoT networks.

CR Jan 30
Secure Tool Manifest and Digital Signing Solution for Verifiable MCP and LLM Pipelines

Saeid Jamshidi, Kawser Wazed Nafi, Arghavan Moradi Dakhel et al.

Large Language Models (LLMs) are increasingly adopted in sensitive domains such as healthcare and financial institutions' data analytics; however, their execution pipelines remain vulnerable to manipulation and unverifiable behavior. Existing control mechanisms, such as the Model Context Protocol (MCP), define compliance policies for tool invocation but lack verifiable enforcement and transparent validation of model actions. To address this gap, we propose a novel Secure Tool Manifest and Digital Signing Framework, a structured and security-aware extension of the Model Context Protocol. The framework enforces cryptographically signed manifests, integrates transparent verification logs, and isolates model-internal execution metadata from user-visible components to ensure verifiable execution integrity. The evaluation demonstrates that the framework scales nearly linearly (R-squared = 0.998), achieves near-perfect acceptance of valid executions while consistently rejecting invalid ones, and maintains balanced model utilization across execution pipelines.
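The idea of a cryptographically signed manifest can be sketched as follows. The abstract does not state which signature scheme the framework uses, so this sketch swaps in an HMAC-SHA256 tag over a shared secret from the Python standard library as a stand-in; the manifest fields and the key are hypothetical.

```python
import hashlib
import hmac
import json


def sign_manifest(manifest: dict, key: bytes) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the manifest."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_manifest(manifest: dict, signature: str, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_manifest(manifest, key), signature)


# Hypothetical manifest and key, for illustration only.
key = b"demo-key-not-for-production"
manifest = {"tool": "query_records", "version": "1.0", "allowed_args": ["patient_id"]}
signature = sign_manifest(manifest, key)

# An execution request whose manifest was altered after signing.
altered = dict(manifest, allowed_args=["patient_id", "export_all"])

print(verify_manifest(manifest, signature, key))
print(verify_manifest(altered, signature, key))
```

A deployment that needs third-party verifiability would more likely use an asymmetric signature so that verifiers never hold the signing key; the accept-or-reject logic would stay the same.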

CL Dec 2, 2025
The Moral Consistency Pipeline: Continuous Ethical Evaluation for Large Language Models

Saeid Jamshidi, Kawser Wazed Nafi, Arghavan Moradi Dakhel et al.

The rapid advancement and adaptability of Large Language Models (LLMs) highlight the need for moral consistency, the capacity to maintain ethically coherent reasoning across varied contexts. Existing alignment frameworks, structured approaches designed to align model behavior with human ethical and social norms, often rely on static datasets and post-hoc evaluations, offering limited insight into how ethical reasoning may evolve across different contexts or temporal scales. This study presents the Moral Consistency Pipeline (MoCoP), a dataset-free, closed-loop framework for continuously evaluating and interpreting the moral stability of LLMs. MoCoP combines three supporting layers: (i) lexical integrity analysis, (ii) semantic risk estimation, and (iii) reasoning-based judgment modeling within a self-sustaining architecture that autonomously generates, evaluates, and refines ethical scenarios without external supervision. Our empirical results on GPT-4-Turbo and DeepSeek suggest that MoCoP effectively captures longitudinal ethical behavior, revealing a strong inverse relationship between ethical and toxicity dimensions (r_ET = -0.81, p < 0.001) and a near-zero association with response latency (r_EL ≈ 0). These findings demonstrate that moral coherence and linguistic safety tend to emerge as stable and interpretable characteristics of model behavior rather than short-term fluctuations. Furthermore, by reframing ethical evaluation as a dynamic, model-agnostic form of moral introspection, MoCoP offers a reproducible foundation for scalable, continuous auditing and advances the study of computational morality in autonomous AI systems.

CR Jan 30
Tri-LLM Cooperative Federated Zero-Shot Intrusion Detection with Semantic Disagreement and Trust-Aware Aggregation

Saeid Jamshidi, Omar Abdul Wahab, Foutse Khomh et al.

Federated learning (FL) has become an effective paradigm for privacy-preserving, distributed Intrusion Detection Systems (IDS) in cyber-physical and Internet of Things (IoT) networks, where centralized data aggregation is often infeasible due to privacy and bandwidth constraints. Despite its advantages, most existing FL-based IDS assume closed-set learning and lack mechanisms such as uncertainty estimation, semantic generalization, and explicit modeling of epistemic ambiguity in zero-day attack scenarios. Additionally, robustness to heterogeneous and unreliable clients remains a challenge in practical applications. This paper introduces a semantics-driven federated IDS framework that incorporates language-derived semantic supervision into federated optimization, enabling open-set and zero-shot intrusion detection for previously unseen attack behaviors. The approach constructs semantic attack prototypes using a Tri-LLM ensemble of GPT-4o, DeepSeek-V3, and LLaMA-3-8B, aligning distributed telemetry features with high-level attack concepts. Inter-LLM semantic disagreement is modeled as epistemic uncertainty for zero-day risk estimation, while a trust-aware aggregation mechanism dynamically weights client updates based on reliability. Experimental results show stable semantic alignment across heterogeneous clients and consistent convergence. The framework achieves over 80% zero-shot detection accuracy on unseen attack patterns, improving zero-day discrimination by more than 10% compared to similarity-based baselines, while maintaining low aggregation instability in the presence of unreliable or compromised clients.
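Two mechanisms named in this abstract, inter-LLM disagreement as an uncertainty signal and trust-weighted aggregation of client updates, can be sketched in a few lines. The abstract does not give the paper's formulas, so the sketch assumes Shannon entropy over the models' labels for disagreement and a plain trust-weighted mean for aggregation; all labels, updates, and trust scores are hypothetical.

```python
from collections import Counter
from math import log2


def disagreement(labels: list[str]) -> float:
    """Shannon entropy (in bits) of the label votes; 0.0 means all models agree."""
    counts = Counter(labels)
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in counts.values())


def trust_weighted_average(updates: list[list[float]], trust: list[float]) -> list[float]:
    """Average client update vectors, weighting each client by its trust score."""
    total = sum(trust)
    if total <= 0:
        raise ValueError("at least one client must have positive trust")
    dim = len(updates[0])
    return [sum(t * u[i] for u, t in zip(updates, trust)) / total for i in range(dim)]


# Hypothetical votes from three LLMs on one traffic sample.
print(disagreement(["ddos", "ddos", "ddos"]))
print(disagreement(["ddos", "port_scan", "benign"]))

# Hypothetical updates from two clients; the second client is fully distrusted.
print(trust_weighted_average([[1.0, 1.0], [3.0, 5.0]], [1.0, 0.0]))
```

Setting a client's trust to zero removes its update from the aggregate entirely, which is the simplest way to see how unreliable clients can be damped.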

68.5 CR May 1
Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

Saeid Jamshidi, Foutse Khomh, Carol Fung et al.

The adoption of Internet of Things (IoT) systems at the network edge of smart architectures is increasing rapidly, intensifying the need for security mechanisms that are both adaptive and resource-efficient. In such environments, runtime defence mechanisms are no longer limited to detection alone but become a resource-constrained task of selecting mitigation actions. Security controls must be carefully selected, combined, and executed under latency, energy, and computational constraints, while preventing unsafe interactions between controls. Existing approaches predominantly rely on static rule sets and learned policies, which provide limited guarantees of feasibility, conflict safety, and execution correctness in resource-constrained edge settings. To address this limitation, we introduce ASPO, a self-adaptive multi-agent security pattern selection approach that integrates Large Language Model (LLM)-based reasoning with deterministic enforcement within a MAPE-K control loop. ASPO explicitly separates stochastic decision generation from execution: LLM agents propose candidate mitigation portfolios, while a deterministic optimisation core enforces closed-world action integrity, conflict-free composition, and resource feasibility at every decision epoch. We deploy ASPO on a distributed edge-gateway testbed and evaluate it across two workloads, comprising 500 and 1000 runtime security decisions respectively, using replayed IoT attack traffic. The results demonstrate invariant safety properties, including 100% conflict-free activation, consistent resource feasibility across workloads, and stable pattern dominance with perfect rank preservation. Importantly, deeper decision exploration reduces extreme-case execution costs, compressing tail latency and energy overheads by 21.9% and 23.1%, respectively, without increasing mean energy consumption.

38.6 AI Apr 1
Adversarial Moral Stress Testing of Large Language Models

Saeid Jamshidi, Foutse Khomh, Arghavan Moradi Dakhel et al.

Evaluating the ethical robustness of large language models (LLMs) deployed in software systems remains challenging, particularly under sustained adversarial user interaction. Existing safety benchmarks typically rely on single-round evaluations and aggregate metrics, such as toxicity scores and refusal rates, which offer limited visibility into behavioral instability that may arise during realistic multi-turn interactions. As a result, rare but high-impact ethical failures and progressive degradation effects may remain undetected prior to deployment. This paper introduces Adversarial Moral Stress Testing (AMST), a stress-based evaluation framework for assessing ethical robustness under adversarial multi-round interactions. AMST applies structured stress transformations to prompts and evaluates model behavior through distribution-aware robustness metrics that capture variance, tail risk, and temporal behavioral drift across interaction rounds. We evaluate AMST on several state-of-the-art LLMs, including LLaMA-3-8B, GPT-4o, and DeepSeek-v3, using a large set of adversarial scenarios generated under controlled stress conditions. The results demonstrate substantial differences in robustness profiles across models and expose degradation patterns that are not observable under conventional single-round evaluation protocols. In particular, robustness has been shown to depend on distributional stability and tail behavior rather than on average performance alone. Additionally, AMST provides a scalable and model-agnostic stress-testing methodology that enables robustness-aware evaluation and monitoring of LLM-enabled software systems operating in adversarial environments.
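The point that robustness depends on tail behavior rather than average performance can be made concrete with a small sketch. The abstract does not define AMST's metrics, so the functions below are generic stand-ins (variance, mean of the worst fraction of scores, and first-to-last-round drift), and the per-response scores in [0, 1], where higher is safer, are hypothetical.

```python
from statistics import mean, pvariance


def tail_mean(scores: list[float], fraction: float = 0.1) -> float:
    """Mean of the worst `fraction` of scores (lower score = less robust)."""
    k = max(1, int(len(scores) * fraction))
    return mean(sorted(scores)[:k])


def drift(scores_by_round: list[list[float]]) -> float:
    """Change in mean score from the first interaction round to the last."""
    return mean(scores_by_round[-1]) - mean(scores_by_round[0])


# Two hypothetical models with the same average score but different tails.
steady = [0.9] * 10
spiky = [1.0] * 9 + [0.0]

for name, scores in (("steady", steady), ("spiky", spiky)):
    print(name, mean(scores), pvariance(scores), tail_mean(scores))

# Hypothetical scores over two rounds of sustained adversarial pressure.
print(drift([[0.9, 0.9], [0.5, 0.7]]))
```

Both hypothetical models average 0.9, so an aggregate metric cannot separate them, while the variance and the worst-case tail expose the single severe failure of the second model.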

DC Apr 25, 2013
A New Trusted and E-Commerce Architecture for Cloud Computing

Kawser Wazed Nafi, Tonny Shekha Kar, Amjad Hossain et al.

Cloud computing platforms give people the opportunity to share resources, services, and information with people around the world. In a private cloud system, information is shared among the people who are in that cloud. At present, different types of internet-based systems run in cloud computing environments, and e-commerce is one of them. Present models are not secure enough to execute e-transactions easily, especially on cloud platforms. Moreover, clients often fail to distinguish between good and bad online business companies, which discourages both clients and companies from migrating to the cloud. In this paper, we propose a new e-commerce architecture that relies on an encryption-based security model and a fuzzy-logic-based Certain Trust model, which will help solve present e-commerce problems. We discuss the complete working procedure of the model. Finally, we discuss some experimental results for the proposed model, which help show its validity.

AI Apr 15, 2013
A Fuzzy Logic Based Certain Trust Model for E-Commerce

Kawser Wazed Nafi, Tonny Shekha Kar, Amjad Hossain et al.

Trustworthiness, especially for service-oriented systems, is a very important topic in the IT field today. Many successful e-commerce organizations currently operate around the world, but e-commerce has not reached its full potential. The main reason is people's lack of trust in e-commerce. Moreover, proper models for calculating the trust of different e-commerce organizations are still absent. Most present trust models are subjective and fail to account for the vagueness and ambiguity of different domains. In this paper, we propose a new fuzzy-logic-based Certain Trust model that takes this ambiguity and vagueness into account. The Fuzzy Based Certain Trust Model depends on certain values given by experts and developers. It can be applied in systems such as cloud computing, the internet, websites, and e-commerce to ensure the trustworthiness of these platforms. We show that, although fuzzy logic works with uncertainties, the proposed model works with certain values. Experimental results and a validation of the model with linguistic terms are presented in the last part of the paper.

DC Mar 4, 2013
A Newer User Authentication, File encryption and Distributed Server Based Cloud Computing Security Architecture

Kawser Wazed Nafi, Tonny Shekha Kar, Sayed Anisul Hoque et al.

Cloud computing platforms give people the opportunity to share resources, services, and information with people around the world. In a private cloud system, information is shared among the people who are in that cloud. As a result, security and the hiding of personal information are hampered. In this paper, we propose a new security architecture for cloud computing platforms. It ensures secure communication and hides information from others. The model includes an AES-based file encryption system and an asynchronous key system for exchanging information or data. The structure can be easily applied to the main cloud computing service models, e.g. PaaS, SaaS, and IaaS. The model also includes a one-time password system for user authentication. Our work mainly deals with the security of the whole cloud computing platform.

CR Mar 3, 2013
An Advanced Certain Trust Model Using Fuzzy Logic and Probabilistic Logic theory

Kawser Wazed Nafi, Tonny Shekha Kar, Amjad Hossain et al.

Trustworthiness, especially for service-oriented systems, is a very important topic in the IT field today. The Certain Trust Model depends on certain values given by experts and developers. Its main parameters for calculating trust are certainty and average rating. In this paper, we propose an extension of the Certain Trust Model, mainly of its representation, based on probabilistic logic and fuzzy logic. The extended model can be applied in systems such as cloud computing, the internet, websites, and e-commerce to ensure the trustworthiness of these platforms. The model uses fuzzy logic to add fuzziness to certainty and average rating in order to calculate the trustworthiness of a system more accurately. We propose two new parameters, trust T and behavioral probability P, which help both users and developers of the system understand its present condition easily. Linguistic variables are defined for both T and P and are then implemented in our laboratory to verify the proposed trust model. We represent the trustworthiness of a test system for two cases of evidence value using Fuzzy Associative Memory (FAM), and we use inference rules and a defuzzification method to verify the model.
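The pipeline this abstract outlines (fuzzify certainty and average rating, apply rules from a Fuzzy Associative Memory, then defuzzify) can be sketched generically. The membership functions, rule table, and output centers below are all hypothetical choices made for illustration, not the values used in the paper; the sketch uses triangular memberships, min for rule firing, and a weighted average of output centers for defuzzification.

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at a and c, 1 at b (a == b or b == c gives a shoulder)."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return 1.0 if b == a else (x - a) / (b - a)
    return 1.0 if c == b else (c - x) / (c - b)


# Hypothetical linguistic terms on [0, 1], shared by both inputs.
TERMS = {"low": (0.0, 0.0, 0.5), "medium": (0.0, 0.5, 1.0), "high": (0.5, 1.0, 1.0)}
CENTERS = {"low": 0.0, "medium": 0.5, "high": 1.0}

# Hypothetical FAM: (average rating term, certainty term) -> trust term.
FAM = {
    ("low", "low"): "medium", ("low", "medium"): "low", ("low", "high"): "low",
    ("medium", "low"): "medium", ("medium", "medium"): "medium", ("medium", "high"): "medium",
    ("high", "low"): "medium", ("high", "medium"): "high", ("high", "high"): "high",
}


def trust(rating: float, certainty: float) -> float:
    """Fire every rule with min(), then defuzzify as a weighted average of centers."""
    num = den = 0.0
    for (r_term, c_term), out_term in FAM.items():
        w = min(tri(rating, *TERMS[r_term]), tri(certainty, *TERMS[c_term]))
        num += w * CENTERS[out_term]
        den += w
    return num / den if den else 0.5


print(trust(1.0, 1.0), trust(1.0, 0.0), trust(0.75, 1.0))
```

In this hypothetical rule table, low certainty always maps to medium trust, which reflects the intuition that little evidence should neither raise nor lower trust much.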