SE LG PL
Nov 16, 2025

Enhancing LLM Code Generation Capabilities through Test-Driven Development and Code Interpreter

Sajed Jalil, Shuvo Saha, Hossain Mohammad Seym

arXiv:2511.12823v1

Originality: Incremental advance

AI Analysis

The work aims to democratize access to code generation tools for Bengali speakers in resource-constrained emerging markets, but it is an incremental advance because it builds on existing techniques.

The paper tackles the problem of low-resource code generation for Bengali by introducing a method combining Test-Driven Development and Code Interpreter, achieving 85% accuracy with Bengali prompts and showing that small models can reach up to 98% of the performance of large models.

Over the past few years, improving LLM code generation capabilities has been a key focus in NLP research. Although Bengali has 242 million native speakers worldwide, it receives little attention in LLM training. More recently, various fine-tuning and augmented generation techniques have been employed to significantly enhance code generation performance. However, these techniques require considerable expertise and resources for an end user to apply effectively. The goal of our work is to democratize access to powerful code generation tools in resource-constrained emerging markets, enabling users to leverage them in their native language. We introduce a novel approach that combines Test-Driven Development (TDD) and Code Interpreter (CI), utilizing open-weight models, which improves the baseline accuracy for code generation with Bengali prompts and achieves an overall accuracy of 85%. Our approach requires no fine-tuning and shows that even the smallest models in the same family can attain up to 98% of the accuracy of the largest models. All of our results are publicly shared on GitHub for validation and reproducibility.
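The abstract does not spell out how TDD and CI are wired together, so the following is only a minimal sketch of one plausible loop, not the authors' implementation. It assumes the tests are plain `assert` statements, that the interpreter step can be approximated with Python's built-in `exec`, and that a failing traceback is fed back to the model on the next attempt. The names `run_tests`, `tdd_loop`, and `fake_model` are hypothetical; `fake_model` is a stand-in for a real LLM call.

```python
import traceback
from typing import Callable, Optional, Tuple

# Hypothetical signature for a model call: (prompt, tests, feedback) -> code.
Generator = Callable[[str, str, str], str]


def run_tests(code: str, tests: str) -> Tuple[bool, str]:
    """Execute candidate code, then its assert-based tests, in a fresh namespace.

    Returns (passed, feedback); feedback is the traceback text on failure.
    Assumption: exec() stands in for a sandboxed code interpreter.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)
        exec(tests, namespace)
        return True, ""
    except Exception:
        return False, traceback.format_exc()


def tdd_loop(prompt: str, tests: str, generate: Generator,
             max_attempts: int = 3) -> Optional[str]:
    """Ask the model for code until the tests pass or attempts run out."""
    feedback = ""
    for _ in range(max_attempts):
        code = generate(prompt, tests, feedback)
        passed, feedback = run_tests(code, tests)
        if passed:
            return code
    return None


def fake_model(prompt: str, tests: str, feedback: str) -> str:
    """Stub model: buggy on the first try, corrected once it sees feedback."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"
```

In this sketch the tests serve as the specification and the interpreter output serves as the repair signal, which is why no fine-tuning step appears. A real setup would run untrusted model output in an isolated process with a timeout rather than in-process `exec`.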
