SE AI CL LG · Dec 3, 2024

Does Few-Shot Learning Help LLM Performance in Code Synthesis?

Derek Xu, Tong Xie, Botao Xia, Haoyu Li, Yunsheng Bai, Yizhou Sun, Wei Wang

arXiv:2412.02906v1 · 9.814 citations · h-index: 24

Originality: Incremental advance

AI Analysis

This work addresses prompt-level optimizations for LLMs in code synthesis, offering practical methods for developers, but it is incremental as it builds on existing few-shot learning techniques.

The paper tackles the problem of optimizing the few-shot examples in code generation prompts for large language models (LLMs), finding that systematic selection methods significantly improve CodeLlama's performance on the HumanEval+ benchmark.

Large language models (LLMs) have made significant strides in code generation through improved model design, training, and chain-of-thought. However, prompt-level optimizations remain an important yet under-explored aspect of LLMs for coding. This work focuses on the few-shot examples present in most code generation prompts, offering a systematic study of whether few-shot examples improve an LLM's coding capabilities, which few-shot examples have the largest impact, and how to select impactful examples. Our work offers two approaches for selecting few-shot examples: a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED. The two methods offer a trade-off between improved performance and reliance on training data and interpretability. Both methods significantly improve CodeLlama's coding ability on the popular HumanEval+ coding benchmark. In summary, our work provides valuable insights into how to pick few-shot examples in code generation prompts to improve LLM code generation capabilities.
