CL | AI | SE | Feb 20, 2025

Pragmatic Reasoning improves LLM Code Generation

arXiv:2502.15835v3 | 6 citations | h-index: 34
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generating accurate code from ambiguous instructions, a common problem for developers using LLMs. It represents an incremental improvement in reranking techniques.

The paper tackles ambiguous user instructions in LLM code generation by proposing CodeRSA, a reranking mechanism based on the Rational Speech Act framework. On the HumanEval and MBPP benchmarks, CodeRSA outperforms common baselines and surpasses the state-of-the-art approach in most cases.

Large Language Models (LLMs) have demonstrated impressive potential in translating natural language (NL) instructions into program code. However, user instructions often contain inherent ambiguities, making it challenging for LLMs to generate code that accurately reflects the user's true intent. To address this challenge, researchers have proposed approaches that produce multiple candidates of the program code and then rerank them to identify the best solution. In this paper, we propose CodeRSA, a novel code candidate reranking mechanism built upon the Rational Speech Act (RSA) framework, designed to guide LLMs toward more comprehensive pragmatic reasoning about user intent. We evaluate CodeRSA using Llama-3-8B-Instruct and Qwen-2.5-7B-Instruct on two widely used code generation benchmarks, HumanEval and MBPP. Our experiment results show that CodeRSA consistently outperforms common baselines, surpasses the state-of-the-art approach in most cases, and demonstrates robust overall performance. These findings underscore the effectiveness of integrating pragmatic reasoning into code candidate reranking, offering a promising direction for enhancing code generation quality in LLMs.
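
To make the reranking idea concrete, here is a minimal sketch of the generic Rational Speech Act recursion applied to code candidates. It is not the authors' implementation: the function names (`normalize`, `rsa_rerank`), the `alpha` rationality parameter, the uniform prior, and the toy score matrix are all assumptions for illustration. In practice the scores would come from an LLM estimating how well each candidate matches each instruction; here they are hard-coded.

```python
from typing import List, Optional


def normalize(values: List[float]) -> List[float]:
    """Scale non-negative values to sum to 1 (uniform if all are zero)."""
    total = sum(values)
    if total == 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def rsa_rerank(
    literal: List[List[float]],
    target: int = 0,
    alpha: float = 1.0,
    prior: Optional[List[float]] = None,
) -> List[float]:
    """Return pragmatic-listener scores over candidates for one instruction.

    literal[i][j] is a hypothetical non-negative score for candidate j
    under instruction i (row `target` is the user's actual instruction;
    the other rows are alternative instructions used for contrast).
    """
    n_instr, n_cand = len(literal), len(literal[0])
    # Literal listener L0: distribution over candidates per instruction.
    l0 = [normalize(row) for row in literal]
    # Pragmatic speaker S1: for each candidate, how likely a speaker who
    # wanted that candidate would have chosen each instruction.
    s1 = [
        normalize([l0[i][j] ** alpha for i in range(n_instr)])
        for j in range(n_cand)
    ]
    # Pragmatic listener L1: S1 for the target instruction times the prior.
    if prior is None:
        prior = [1.0 / n_cand] * n_cand
    return normalize([s1[j][target] * prior[j] for j in range(n_cand)])


# Toy example (made-up numbers): instruction 0 is ambiguous between
# candidates 0 and 1, while alternative instruction 1 clearly describes
# candidate 0. A speaker who wanted candidate 0 would likely have used
# instruction 1, so the pragmatic listener shifts toward candidate 1.
scores = rsa_rerank([[0.5, 0.5], [0.9, 0.1]], target=0)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)
```

The design point is that the literal scores alone cannot separate the two candidates for the ambiguous instruction; the tie is broken only by asking which instruction a speaker would have chosen for each candidate.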

