Mohaiminul Al Nahian

CR
h-index20
5papers
18citations
Novelty72%
AI Score51

5 Papers

97.5CRApr 27Code
CacheTrap: Unveiling a Stealthier Gray-Box Trojan against LLMs

Mohaiminul Al Nahian, Abeer Matar A. Almalky, Gamana Aragonda et al.

The rapid advancement of large language models (LLMs) has sparked growing interest in understanding their security vulnerabilities, particularly Trojan attacks that enable stealthy manipulation of model behavior. Traditional Trojan methods typically alter inputs and/or model weights, relying on white-box assumptions that require access to data or model internal parameters. In this work, we present CacheTrap, the first gray-box Trojan attack targeting the Key-Value (KV) cache of LLMs. This method induces a single-bit flip in the KV cache, serving as a transient trigger. When activated, this trigger causes the model to exhibit targeted actions without changing inputs or model weights. CacheTrap introduces an efficient search algorithm to locate vulnerable positions in the KV cache, independent of model weights or datasets. Extensive experiments on five open-source LLMs show a remarkable 100% attack success rate (with the trigger) while preserving benign accuracy (without the trigger) by flipping just one bit in the KV cache.

79.3CRApr 27Code
Invisible Hands: Gray-Box Bit Flip Attack for Steering LLMs Without Knowledge of Gradients, Data, and Weights

Abeer Matar A. Almalky, Ziyan Wang, Mohaiminul Al Nahian et al.

In recent years, large language models (LLMs) have achieved remarkable advances and are increasingly deployed in critical applications across diverse domains. This growing adoption raises urgent concerns about their security and robustness. In this work, we investigate the impact of Bit Flip Attacks (BFAs) on LLMs, which exploit hardware faults to corrupt model parameters, thereby threatening model integrity and performance. Existing BFA studies primarily assume a white-box setting with access to exact model weights and part of the dataset, and rely on progressive gradient-based bit-search strategies to identify vulnerable bits in model weights. However, gradient computation for LLMs is computationally expensive and memory intensive. In addition, assuming access to exact victim model weights and datasets is challenging due to increasingly strict user privacy regulations. To address these challenges, we propose the first gray-box BFA framework for LLMs, Invisible Hands, designed for efficient and practical deployment. Our method, Gradient-Data-Free-BFA, identifies vulnerable weight bits without requiring knowledge of model weights, gradients, or sample data. It introduces novel vulnerability index metrics that estimate the weights of susceptibility based solely on model architecture (Grey-Box). By eliminating data access and gradient computation, our approach significantly reduces memory overhead and scales efficiently across tasks with constant complexity. Experiments on six open-source LLMs demonstrate that adversarial objectives can be achieved with minimal weight perturbations, highlighting the effectiveness and practicality of Invisible Hands.

CVMay 27, 2025
Unified Alignment Protocol: Making Sense of the Unlabeled Data in New Domains

Sabbir Ahmed, Mamshad Nayeem Rizve, Abdullah Al Arafat et al.

Semi-Supervised Federated Learning (SSFL) is gaining popularity over conventional Federated Learning in many real-world applications. Due to the practical limitation of limited labeled data on the client side, SSFL considers that participating clients train with unlabeled data, and only the central server has the necessary resources to access limited labeled data, making it an ideal fit for real-world applications (e.g., healthcare). However, traditional SSFL assumes that the data distributions in the training phase and testing phase are the same. In practice, however, domain shifts frequently occur, making it essential for SSFL to incorporate generalization capabilities and enhance their practicality. The core challenge is improving model generalization to new, unseen domains while the client participate in SSFL. However, the decentralized setup of SSFL and unsupervised client training necessitates innovation to achieve improved generalization across domains. To achieve this, we propose a novel framework called the Unified Alignment Protocol (UAP), which consists of an alternating two-stage training process. The first stage involves training the server model to learn and align the features with a parametric distribution, which is subsequently communicated to clients without additional communication overhead. The second stage proposes a novel training algorithm that utilizes the server feature distribution to align client features accordingly. Our extensive experiments on standard domain generalization benchmark datasets across multiple model architectures reveal that proposed UAP successfully achieves SOTA generalization performance in SSFL setting.

ROApr 23, 2025
Robo-Troj: Attacking LLM-based Task Planners

Mohaiminul Al Nahian, Zainab Altaweel, David Reitano et al.

Robots need task planning methods to achieve goals that require more than individual actions. Recently, large language models (LLMs) have demonstrated impressive performance in task planning. LLMs can generate a step-by-step solution using a description of actions and the goal. Despite the successes in LLM-based task planning, there is limited research studying the security aspects of those systems. In this paper, we develop Robo-Troj, the first multi-trigger backdoor attack for LLM-based task planners, which is the main contribution of this work. As a multi-trigger attack, Robo-Troj is trained to accommodate the diversity of robot application domains. For instance, one can use unique trigger words, e.g., "herical", to activate a specific malicious behavior, e.g., cutting hand on a kitchen robot. In addition, we develop an optimization method for selecting the trigger words that are most effective. Through demonstrating the vulnerability of LLM-based planners, we aim to promote the development of secured robot systems.

CVAug 23, 2017
CNN-Based Prediction of Frame-Level Shot Importance for Video Summarization

Mohaiminul Al Nahian, A. S. M. Iftekhar, Mohammad Tariqul Islam et al.

In the Internet, ubiquitous presence of redundant, unedited, raw videos has made video summarization an important problem. Traditional methods of video summarization employ a heuristic set of hand-crafted features, which in many cases fail to capture subtle abstraction of a scene. This paper presents a deep learning method that maps the context of a video to the importance of a scene similar to that is perceived by humans. In particular, a convolutional neural network (CNN)-based architecture is proposed to mimic the frame-level shot importance for user-oriented video summarization. The weights and biases of the CNN are trained extensively through off-line processing, so that it can provide the importance of a frame of an unseen video almost instantaneously. Experiments on estimating the shot importance is carried out using the publicly available database TVSum50. It is shown that the performance of the proposed network is substantially better than that of commonly referred feature-based methods for estimating the shot importance in terms of mean absolute error, absolute error variance, and relative F-measure.