SE, AI, CL, LG | Dec 3, 2024

Does Few-Shot Learning Help LLM Performance in Code Synthesis?

arXiv:2412.02906v1 | 14 citations | h-index: 24
Originality: Incremental advance
AI Analysis

This work addresses prompt-level optimizations for LLMs in code synthesis, offering practical methods for developers, but it is incremental as it builds on existing few-shot learning techniques.

The paper tackled the problem of optimizing few-shot examples in prompts for large language models (LLMs) in code generation, finding that systematic selection methods significantly improved CodeLlama's performance on the HumanEval+ benchmark.

Large language models (LLMs) have made significant strides in code generation through improved model design, training, and chain-of-thought. However, prompt-level optimizations remain an important yet under-explored aspect of LLMs for coding. This work focuses on the few-shot examples present in most code generation prompts, offering a systematic study of whether few-shot examples improve an LLM's coding capabilities, which few-shot examples have the largest impact, and how to select impactful examples. Our work offers two approaches for selecting few-shot examples: a model-free method, CODEEXEMPLAR-FREE, and a model-based method, CODEEXEMPLAR-BASED. The two methods offer a trade-off between improved performance and reliance on training data and interpretability. Both methods significantly improve CodeLlama's coding ability on the popular HumanEval+ coding benchmark. In summary, our work provides valuable insights into how to pick few-shot examples in code generation prompts to improve LLM code generation capabilities.
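
To make the idea of selecting few-shot examples for a code generation prompt concrete, here is a minimal sketch. It is not CODEEXEMPLAR-FREE or CODEEXEMPLAR-BASED, whose details the abstract does not give. It swaps in a plain token-overlap (Jaccard) heuristic as an assumed stand-in for a selection signal, and the names `POOL`, `select_examples`, and `build_prompt`, along with the prompt layout, are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word tokens; a crude stand-in for a real similarity signal."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(task, pool, k=2):
    """Pick the k pool entries whose task descriptions overlap most with the task."""
    task_tokens = tokens(task)
    ranked = sorted(
        pool,
        key=lambda ex: jaccard(task_tokens, tokens(ex["task"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, examples):
    """Concatenate the chosen examples, then the unsolved target task."""
    parts = [f"# Task: {ex['task']}\n{ex['solution']}\n" for ex in examples]
    parts.append(f"# Task: {task}\n")
    return "\n".join(parts)


# Hypothetical candidate pool of solved tasks (illustrative data only).
POOL = [
    {"task": "Return the sum of a list of integers.",
     "solution": "def total(xs):\n    return sum(xs)"},
    {"task": "Reverse a string.",
     "solution": "def reverse(s):\n    return s[::-1]"},
    {"task": "Return the largest integer in a list.",
     "solution": "def largest(xs):\n    return max(xs)"},
]

task = "Return the product of a list of integers."
chosen = select_examples(task, POOL, k=2)
prompt = build_prompt(task, chosen)
print(prompt)
```

Here the two list-related examples are ranked above the string-reversal one, so only they appear in the prompt. A real study would replace the overlap heuristic with whatever signal is being evaluated and measure the effect on benchmark pass rates.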

