Md. Mostafizer Rahman

CL
3 papers
74 citations
Novelty 38%
AI Score 37

3 Papers

CL · Sep 26, 2023
Program Repair with Minimal Edits Using CodeT5

Atsushi Shirafuji, Md. Mostafizer Rahman, Md Faizul Ibne Amin et al.

Programmers often struggle to identify and fix bugs in their programs. In recent years, many language models (LMs) have been proposed to fix erroneous programs and support error recovery. However, the LMs tend to generate solutions that differ from the original input programs. This leads to potential comprehension difficulties for users. In this paper, we propose an approach to suggest a correct program with minimal repair edits using CodeT5. We fine-tune a pre-trained CodeT5 on code pairs of wrong and correct programs and evaluate its performance against several baseline models. The experimental results show that the fine-tuned CodeT5 achieves a pass@100 of 91.95% and an average edit distance of 6.84 to the most similar correct program, which indicates that at least one correct program can be suggested by generating 100 candidate programs. We demonstrate the effectiveness of LMs in suggesting program repair with minimal edits for solving introductory programming problems.
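The two figures quoted in this abstract, pass@100 and the edit distance to the closest correct program, can be illustrated with a minimal sketch. It assumes that pass@k is counted as the share of problems for which at least one of k generated candidates is correct, and that edit distance is a Levenshtein distance over token sequences; the abstract does not state the granularity, and the helper names `edit_distance`, `pass_at_k`, and `closest_correct` are hypothetical, not taken from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert/delete/substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def pass_at_k(results):
    """results holds one list of pass/fail flags per problem (one flag per candidate).

    Returns the fraction of problems with at least one passing candidate.
    """
    if not results:
        return 0.0
    return sum(any(flags) for flags in results) / len(results)


def closest_correct(wrong, candidates, is_correct):
    """Return (distance, candidate) for the correct candidate nearest to `wrong`,
    or None when no candidate is correct. `is_correct` stands in for a test harness."""
    scored = [(edit_distance(wrong, c), c) for c in candidates if is_correct(c)]
    return min(scored, key=lambda pair: pair[0]) if scored else None


# Toy usage: token lists stand in for programs; the checker is a placeholder.
wrong = "a = b - c".split()
candidates = ["a = b + c".split(), "a = c + b".split(), "a = b * c".split()]
best = closest_correct(wrong, candidates, lambda c: "+" in c)
```

Choosing the nearest correct candidate among many samples is one way to favor small repairs; whether the paper selects candidates this way is not stated in the abstract.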

23.5 · SE · Mar 26
Error Understanding in Program Code With LLM-DL for Multi-label Classification

Md Faizul Ibne Amin, Yutaka Watanobe, Md. Mostafizer Rahman et al.

Programming is a core skill in computer science and software engineering (SE), yet identifying and resolving code errors remains challenging for both novice and experienced developers. While Large Language Models (LLMs) have shown remarkable capabilities in natural language understanding and generation tasks, their potential in domain-specific, complex scenarios, such as multi-label classification (MLC) of programming errors, remains underexplored. Recognizing this less-explored area, this study proposes a multi-label error classification (MLEC) framework for source code that leverages fine-tuned LLMs, including CodeT5-base, GraphCodeBERT, CodeT5+, UniXcoder, RoBERTa, PLBART, and CoTexT. These LLMs are integrated with deep learning (DL) architectures such as GRU, LSTM, BiLSTM, and BiLSTM with an additive attention mechanism (BiLSTM-A) to capture both syntactic and semantic features from a real-world student-written Python code error dataset. Extensive experiments across 32 model variants, optimized using Optuna-based hyperparameter tuning, have been evaluated using comprehensive multi-label metrics, including average accuracy, macro and weighted precision, recall, F1-score, exact match accuracy, One-error, Hamming loss, Jaccard similarity, and ROC-AUC (micro, macro, and weighted). Results show that the CodeT5+_GRU model achieved the strongest performance, with a weighted F1-score of 0.8243, average accuracy of 91.84%, exact match accuracy of 53.78%, Hamming loss of 0.0816, and One-error of 0.0708. These findings confirm the effectiveness of combining pretrained semantic encoders with efficient recurrent decoders. This work lays the foundation for developing intelligent, scalable tools for automated code feedback, with potential applications in programming education (PE) and broader SE domains.
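Three of the multi-label metrics named in this abstract can be sketched under their standard textbook definitions, which are assumed here rather than taken from the paper: Hamming loss is the fraction of individual label slots predicted wrongly, exact match accuracy is the fraction of samples whose whole label vector is right, and One-error is the fraction of samples whose top-scored label is not a true label. The reported average accuracy (91.84%) and Hamming loss (0.0816) sum to 1, which is consistent with average accuracy being the per-label complement of Hamming loss. All data below is made up for illustration.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label slots where prediction and truth disagree."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p
                for true_row, pred_row in zip(y_true, y_pred)
                for t, p in zip(true_row, pred_row))
    return wrong / total


def exact_match(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly."""
    hits = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return hits / len(y_true)


def one_error(y_true, scores):
    """Fraction of samples whose highest-scored label is not a true label."""
    misses = 0
    for true_row, score_row in zip(y_true, scores):
        top = max(range(len(score_row)), key=score_row.__getitem__)
        if true_row[top] == 0:
            misses += 1
    return misses / len(y_true)


# Toy data: two samples, three hypothetical error labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]                # binarized predictions
scores = [[0.9, 0.2, 0.4], [0.6, 0.3, 0.1]]    # raw per-label scores

hl = hamming_loss(y_true, y_pred)   # one wrong slot out of six
em = exact_match(y_true, y_pred)    # only the second sample matches fully
oe = one_error(y_true, scores)      # second sample's top label is not true
```

The gap between a low Hamming loss and a much lower exact match accuracy, as in the reported results, is expected: a single wrong label in a sample counts as one slot error for Hamming loss but fails the whole sample for exact match.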

CL · Jun 1, 2024
RoBERTa-BiLSTM: A Context-Aware Hybrid Model for Sentiment Analysis

Md. Mostafizer Rahman, Ariful Islam Shiplu, Yutaka Watanobe et al.

Effectively analyzing comments to uncover latent intentions holds immense value in making strategic decisions across various domains. However, several challenges hinder the process of sentiment analysis, including the lexical diversity exhibited in comments, the presence of long dependencies within the text, unknown symbols and words, and imbalanced datasets. Moreover, existing sentiment analysis approaches mostly rely on sequential models to encode long-dependent texts, which require longer execution time because they process the text sequentially. In contrast, the Transformer requires less execution time due to its parallel processing nature. In this work, we introduce a novel hybrid deep learning model, RoBERTa-BiLSTM, which combines the Robustly Optimized BERT Pretraining Approach (RoBERTa) with Bidirectional Long Short-Term Memory (BiLSTM) networks. RoBERTa is utilized to generate meaningful word embedding vectors, while BiLSTM effectively captures the contextual semantics of long-dependent texts. The RoBERTa-BiLSTM hybrid model leverages the strengths of both sequential and Transformer models to enhance performance in sentiment analysis. We conducted experiments using datasets from IMDb, Twitter US Airline, and Sentiment140 to evaluate the proposed model against existing state-of-the-art methods. Our experimental findings demonstrate that the RoBERTa-BiLSTM model surpasses baseline models (e.g., BERT, RoBERTa-base, RoBERTa-GRU, and RoBERTa-LSTM), achieving accuracies of 80.74%, 92.36%, and 82.25% on the Twitter US Airline, IMDb, and Sentiment140 datasets, respectively. Additionally, the model achieves F1-scores of 80.73%, 92.35%, and 82.25% on the same datasets, respectively.