Muhammad Rafsan Kabir

CL
h-index17
7papers
22citations
Novelty51%
AI Score46

7 Papers

CLAug 20, 2024Code
Beyond Labels: Aligning Large Language Models with Human-like Reasoning

Muhammad Rafsan Kabir, Rafeed Mohammad Sultan, Ihsanul Haque Asif et al.

Aligning large language models (LLMs) with a human reasoning approach ensures that LLMs produce morally correct and human-like decisions. Ethical concerns are raised because current models are prone to generating false positives and providing malicious responses. To contribute to this issue, we have curated an ethics dataset named Dataset for Aligning Reasons (DFAR), designed to aid in aligning language models to generate human-like reasons. The dataset comprises statements with ethical-unethical labels and their corresponding reasons. In this study, we employed a unique and novel fine-tuning approach that utilizes ethics labels and their corresponding reasons (L+R), in contrast to the existing fine-tuning approach that only uses labels (L). The original pre-trained versions, the existing fine-tuned versions, and our proposed fine-tuned versions of LLMs were then evaluated on an ethical-unethical classification task and a reason-generation task. Our proposed fine-tuning strategy notably outperforms the others in both tasks, achieving significantly higher accuracy scores in the classification task and lower misalignment rates in the reason-generation task. The increase in classification accuracies and decrease in misalignment rates indicate that the L+R fine-tuned models align more with human ethics. Hence, this study illustrates that injecting reasons has substantially improved the alignment of LLMs, resulting in more human-like responses. We have made the DFAR dataset and corresponding codes publicly available at https://github.com/apurba-nsu-rnd-lab/DFAR.

CVDec 9, 2025
Beyond Real Weights: Hypercomplex Representations for Stable Quantization

Jawad Ibn Ahad, Maisha Rahman, Amrijit Biswas et al.

Multimodal language models (MLLMs) require large parameter capacity to align high-dimensional visual features with linguistic representations, making them computationally heavy and difficult to deploy efficiently. We introduce a progressive reparameterization strategy that compresses these models by gradually replacing dense feed-forward network blocks with compact Parameterized Hypercomplex Multiplication (PHM) layers. A residual interpolation schedule, together with lightweight reconstruction and knowledge distillation losses, ensures that the PHM modules inherit the functional behavior of their dense counterparts during training. This transition yields substantial parameter and FLOP reductions while preserving strong multimodal alignment, enabling faster inference without degrading output quality. We evaluate the approach on multiple vision-language models (VLMs). Our method maintains performance comparable to the base models while delivering significant reductions in model size and inference latency. Progressive PHM substitution thus offers an architecture-compatible path toward more efficient multimodal reasoning and complements existing low-bit quantization techniques.

CLNov 14, 2025
LAET: A Layer-wise Adaptive Ensemble Tuning Framework for Pretrained Language Models

Jawad Ibn Ahad, Muhammad Rafsan Kabir, Robin Krambroeckers et al.

Natural Language Processing (NLP) has transformed the financial industry, enabling advancements in areas such as textual analysis, risk management, and forecasting. Large language models (LLMs) like BloombergGPT and FinMA have set new benchmarks across various financial NLP tasks, including sentiment analysis, stock movement prediction, and credit risk assessment. Furthermore, FinMA-ES, a bilingual financial LLM, has also demonstrated strong performance using the FLARE and FLARE-ES benchmarks. However, the high computational demands of these models limit the accessibility of many organizations. To address this, we propose Layer-wise Adaptive Ensemble Tuning (LAET), a novel strategy that selectively fine-tunes the most effective layers of pre-trained LLMs by analyzing hidden state representations while freezing less critical layers. LAET significantly reduces computational overhead while enhancing task-specific performance. Our approach shows strong results in financial NLP tasks, outperforming existing benchmarks and state-of-the-art LLMs such as GPT-4, even with smaller LLMs ($\sim$3B parameters). This work bridges cutting-edge financial NLP research and real-world deployment with efficient and scalable models for financial applications.

LGAug 18, 2025Code
Z-Pruner: Post-Training Pruning of Large Language Models for Efficiency without Retraining

Samiul Basir Bhuiyan, Md. Sazzad Hossain Adib, Mohammed Aman Bhuiyan et al.

Large language models (LLMs) have rapidly advanced in recent years, achieving remarkable performance across a wide range of natural language processing tasks. However, this progress has come at the cost of increasingly large model sizes, which pose significant challenges for deployment, scalability, and energy efficiency. To address these limitations, post-training pruning has emerged as a promising approach for reducing model size and inference latency without the need for retraining. Despite these advantages, many existing pruning methods result in substantial performance degradation or require computationally expensive fine-tuning. In this work, we introduce Z-Pruner, a novel post-training pruning method designed to induce sparsity in pretrained LLMs without any retraining. Unlike conventional approaches, Z-Pruner leverages both weight update magnitudes and activation patterns to identify and eliminate redundant parameters more effectively. Our method is model-agnostic, efficient, and easy to implement. We evaluate Z-Pruner using multiple widely-used LLM architectures, including LLaMA-2, LLaMA-3, and OPT, across a diverse set of standard language benchmarks. Experimental results demonstrate that Z-Pruner surpasses state-of-the-art pruning methods that require intensive weight updates. Specifically, Z-Pruner achieves the lowest perplexity scores and the highest overall average score for zero-shot accuracy. We have made the corresponding codes publicly available at https://github.com/sazzadadib/Z-Pruner.

CLNov 22, 2024
BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques

Muhammad Rafsan Kabir, Md. Mohibur Rahman Nabil, Mohammad Ashrafuzzaman Khan

Sentence-level embedding is essential for various tasks that require understanding natural language. Many studies have explored such embeddings for high-resource languages like English. However, low-resource languages like Bengali (a language spoken by almost two hundred and thirty million people) are still under-explored. This work introduces two lightweight sentence transformers for the Bangla language, leveraging a novel cross-lingual knowledge distillation approach. This method distills knowledge from a pre-trained, high-performing English sentence transformer. Proposed models are evaluated across multiple downstream tasks, including paraphrase detection, semantic textual similarity (STS), and Bangla hate speech detection. The new method consistently outperformed existing Bangla sentence transformers. Moreover, the lightweight architecture and shorter inference time make the models highly suitable for deployment in resource-constrained environments, making them valuable for practical NLP applications in low-resource languages.

IRApr 19, 2025
LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval

Muhammad Rafsan Kabir, Rafeed Mohammad Sultan, Fuad Rahman et al.

Natural Language Processing (NLP) and computational linguistic techniques are increasingly being applied across various domains, yet their use in legal and regulatory tasks remains limited. To address this gap, we develop an efficient bilingual question-answering framework for regulatory documents, specifically the Bangladesh Police Gazettes, which contain both English and Bangla text. Our approach employs modern Retrieval Augmented Generation (RAG) pipelines to enhance information retrieval and response generation. In addition to conventional RAG pipelines, we propose an advanced RAG-based approach that improves retrieval performance, leading to more precise answers. This system enables efficient searching for specific government legal notices, making legal information more accessible. We evaluate both our proposed and conventional RAG systems on a diverse test set on Bangladesh Police Gazettes, demonstrating that our approach consistently outperforms existing methods across all evaluation metrics.

CVSep 8, 2025
Video-Based MPAA Rating Prediction: An Attention-Driven Hybrid Architecture Using Contrastive Learning

Dipta Neogi, Nourash Azmine Chowdhury, Muhammad Rafsan Kabir et al.

The rapid growth of visual content consumption across platforms necessitates automated video classification for age-suitability standards like the MPAA rating system (G, PG, PG-13, R). Traditional methods struggle with large labeled data requirements, poor generalization, and inefficient feature learning. To address these challenges, we employ contrastive learning for improved discrimination and adaptability, exploring three frameworks: Instance Discrimination, Contextual Contrastive Learning, and Multi-View Contrastive Learning. Our hybrid architecture integrates an LRCN (CNN+LSTM) backbone with a Bahdanau attention mechanism, achieving state-of-the-art performance in the Contextual Contrastive Learning framework, with 88% accuracy and an F1 score of 0.8815. By combining CNNs for spatial features, LSTMs for temporal modeling, and attention mechanisms for dynamic frame prioritization, the model excels in fine-grained borderline distinctions, such as differentiating PG-13 and R-rated content. We evaluate the model's performance across various contrastive loss functions, including NT-Xent, NT-logistic, and Margin Triplet, demonstrating the robustness of our proposed architecture. To ensure practical application, the model is deployed as a web application for real-time MPAA rating classification, offering an efficient solution for automated content compliance across streaming platforms.