CL · Nov 10, 2025

Retriv at BLP-2025 Task 2: Test-Driven Feedback-Guided Framework for Bangla-to-Python Code Generation

K M Nafi Asib, Sourav Saha, Mohammed Moshiul Hoque

arXiv:2511.07382v1 · 4.91 citations · h-index: 31

Originality: Incremental advance

AI Analysis

This work addresses the underrepresentation of low-resource languages like Bangla in automated code generation. The contribution is incremental, since it adapts existing methods to a specific domain.

The authors tackled code generation from Bangla natural language instructions by proposing a test-driven, feedback-guided iterative refinement method using a fine-tuned Qwen2.5-14B model, achieving a Pass@1 score of 0.934 and securing 2nd place in a shared task.

Large Language Models (LLMs) have advanced the automated generation of code from natural language prompts. However, low-resource languages (LRLs) like Bangla remain underrepresented due to the limited availability of instruction-to-code datasets and evaluation benchmarks. To address this, the BLP Workshop at IJCNLP-AACL 2025 introduced a shared task on "Code Generation in Bangla". In this work, we propose a method that combines instruction prompting with a test-driven, feedback-guided iterative refinement process using a fine-tuned Qwen2.5-14B model. The model generates code from Bangla instructions, tests it against unit tests, and iteratively refines any failing outputs through three evaluation passes, using test feedback to guide each step. This approach helped our team "Retriv" to secure 2nd place in the shared task with a Pass@1 score of 0.934. The analysis highlights challenges in Bangla instruction understanding and Python code generation, emphasizing the need for targeted methods in LRLs. We made experimental scripts publicly available for the community.
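The test-driven, feedback-guided loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `run_tests`, `refine`, `pass_at_1`, and `stub_generate` are hypothetical, the `generate` callable stands in for the fine-tuned Qwen2.5-14B model, and the sketch assumes that the first generation counts as one of the three passes and that unit tests are plain `assert` statements. It also assumes one sample per problem, so Pass@1 reduces to the fraction of problems whose final code passes all tests.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (instruction, feedback, previous_code) -> new code.
Generator = Callable[[str, Optional[str], str], str]


def run_tests(code: str, tests: List[str]) -> Optional[str]:
    """Run candidate code against assert-style tests.

    Returns None if every test passes, otherwise a short error message.
    NOTE: exec() is used only for illustration; untrusted model output
    should be executed in a sandbox.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
    except Exception as exc:
        return f"code failed to load: {type(exc).__name__}: {exc}"
    for test in tests:
        try:
            exec(test, namespace)
        except Exception as exc:
            return f"test {test!r} failed: {type(exc).__name__}: {exc}"
    return None


def refine(instruction: str, tests: List[str], generate: Generator,
           max_passes: int = 3) -> Tuple[str, bool]:
    """Generate code, test it, and retry with test feedback on failure."""
    code, feedback = "", None
    for _ in range(max_passes):
        code = generate(instruction, feedback, code)
        feedback = run_tests(code, tests)
        if feedback is None:
            return code, True
    return code, False


def pass_at_1(outcomes: List[bool]) -> float:
    """Fraction of problems solved, assuming one sample per problem."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def stub_generate(instruction: str, feedback: Optional[str], previous: str) -> str:
    """Stand-in for the model: buggy first attempt, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b"
    return "def add(a, b):\n    return a + b"


if __name__ == "__main__":
    final_code, solved = refine("দুটি সংখ্যা যোগ করো",
                                ["assert add(2, 3) == 5"], stub_generate)
    print(solved, pass_at_1([solved]))
```

In a real setup, `generate` would build a prompt from the Bangla instruction, the previous code, and the failure message, then call the model. How the authors format that feedback is not stated in the abstract.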
