CL | Nov 10, 2025

Retriv at BLP-2025 Task 2: Test-Driven Feedback-Guided Framework for Bangla-to-Python Code Generation

arXiv:2511.07382v1 | 1 citation | h-index: 31
Originality: Incremental advance
AI Analysis

The paper addresses the underrepresentation of low-resource languages such as Bangla in automated code generation. The contribution is incremental, since it adapts existing methods to a specific domain.

The authors tackle code generation from Bangla natural language instructions with a test-driven, feedback-guided iterative refinement method built on a fine-tuned Qwen2.5-14B model, achieving a Pass@1 score of 0.934 and securing 2nd place in the BLP-2025 shared task.
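For readers unfamiliar with the metric, the sketch below shows one common way to compute Pass@1 when a single program is generated per problem: the fraction of problems whose program passes all of its unit tests. The function name `pass_at_1` and the input layout are illustrative assumptions; the summary does not describe the shared task's exact scoring script.

```python
from typing import List


def pass_at_1(results: List[List[bool]]) -> float:
    """Fraction of problems solved on the first (and only) sample.

    `results` holds one inner list per problem, with one boolean per
    unit test (True = test passed). This layout is an assumption made
    for illustration, not the shared task's actual format.
    """
    if not results:
        raise ValueError("results must contain at least one problem")
    # A problem counts as solved only if it has tests and all of them pass.
    solved = sum(1 for tests in results if tests and all(tests))
    return solved / len(results)


# Three problems: the first and third pass every test, the second fails one.
score = pass_at_1([[True, True], [True, False], [True]])
```

Under this definition, a score of 0.934 would mean that roughly 93 of every 100 problems were solved by the single submitted program.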

Large Language Models (LLMs) have advanced the automated generation of code from natural language prompts. However, low-resource languages (LRLs) like Bangla remain underrepresented due to the limited availability of instruction-to-code datasets and evaluation benchmarks. To address this, the BLP Workshop at IJCNLP-AACL 2025 introduced a shared task on "Code Generation in Bangla". In this work, we propose a method that combines instruction prompting with a test-driven, feedback-guided iterative refinement process using a fine-tuned Qwen2.5-14B model. The model generates code from Bangla instructions, tests it against unit tests, and iteratively refines any failing outputs through three evaluation passes, using test feedback to guide each step. This approach helped our team "Retriv" secure 2nd place in the shared task with a Pass@1 score of 0.934. The analysis highlights challenges in Bangla instruction understanding and Python code generation, emphasizing the need for targeted methods for LRLs. We have made the experimental scripts publicly available to the community.
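The generate, test, and refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `run_tests`, `refine`, and `toy_generate` are hypothetical names, the model is replaced by a stub so the example runs without any LLM, unit tests are assumed to be `assert` statements given as strings, and "three evaluation passes" is read here as at most three generate-and-test rounds.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical generator signature: (instruction, feedback) -> Python source.
Generator = Callable[[str, Optional[str]], str]


def run_tests(code: str, tests: List[str]) -> Optional[str]:
    """Return None if every test passes, otherwise a short failure message."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # unsafe for untrusted code; sandbox in practice
    except Exception as exc:
        return f"code failed to load: {type(exc).__name__}: {exc}"
    for test in tests:
        try:
            exec(test, namespace)
        except Exception as exc:
            return f"failed test `{test}`: {type(exc).__name__}: {exc}"
    return None


def refine(instruction: str, tests: List[str], generate: Generator,
           max_passes: int = 3) -> Tuple[str, bool, int]:
    """Generate code, test it, and feed failures back for up to max_passes."""
    feedback: Optional[str] = None
    code = ""
    for attempt in range(1, max_passes + 1):
        code = generate(instruction, feedback)
        feedback = run_tests(code, tests)
        if feedback is None:
            return code, True, attempt  # all tests passed on this pass
    return code, False, max_passes  # still failing after the final pass


def toy_generate(instruction: str, feedback: Optional[str]) -> str:
    """Stub standing in for the fine-tuned model: wrong first, fixed later."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


code, passed, attempts = refine(
    "Write a function that adds two numbers",  # stands in for a Bangla prompt
    ["assert add(2, 3) == 5"],
    toy_generate,
)
```

In this toy run the first attempt fails the test, the failure message is passed back to the generator, and the second attempt passes. A real pipeline would build a new prompt from the instruction, the failing code, and the test feedback, and would run the generated code in an isolated process with a timeout.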

