SE · Apr 22

Are Decoder-Only Large Language Models the Silver Bullet for Code Search?

arXiv:2410.22240 · 68.17 citations · h-index: 18
Predicted impact: top 22% in SE · last 90 days · Originality: Incremental advance
AI Analysis

This paper offers developers and researchers a practical guide to optimizing LLMs for code search, though the advance is incremental because it compares existing model types.

The paper systematically evaluated decoder-only large language models for code search, finding that fine-tuned models like CodeGemma outperform encoder-only models by 40.4% in Mean Average Precision on the CoSQA+ benchmark.

Code search is essential for code reuse, allowing developers to efficiently locate relevant code snippets. The advent of powerful decoder-only Large Language Models (LLMs) has revolutionized many code intelligence tasks. However, their effectiveness for the retrieval-based task of code search, particularly compared to established encoder-based models, remains underexplored. This paper addresses this gap by presenting a large-scale systematic evaluation of eleven decoder-only LLMs, analyzing their performance across zero-shot and fine-tuned settings. Our results show that fine-tuned decoder-only models, particularly CodeGemma, significantly outperform encoder-only models like UniXcoder, achieving a 40.4% higher Mean Average Precision (MAP) on the CoSQA+ benchmark. Our analysis further reveals two crucial nuances for practitioners: first, the relationship between model size and performance is non-monotonic, with mid-sized models often outperforming larger variants; second, the composition of the training data is critical, as a multilingual dataset enhances generalization while a small amount of data from a specific language can act as noise and interfere with model effectiveness. These findings offer a comprehensive guide to selecting and optimizing modern LLMs for code search.
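The headline result above is stated in Mean Average Precision (MAP). As a rough illustration of what that metric measures, the sketch below computes MAP from ranked retrieval results using the standard textbook definition. It is not the paper's evaluation code: the function names, the toy snippet identifiers, and the assumption that each ranked list contains no duplicates are all hypothetical choices made for this example.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision for one query.

    Averages precision@k over every rank k that holds a relevant item,
    dividing by the total number of relevant items for the query.
    Assumes `ranked` contains no duplicate items (illustrative assumption).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k  # precision at rank k
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """MAP: the mean of per-query average precision over all queries."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Toy data: two queries, each with a ranked list of code-snippet ids
    # and the set of ids judged relevant (hypothetical values).
    toy_runs = [
        (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
        (["x", "y", "z"], {"y"}),            # AP = 1/2
    ]
    print(f"MAP = {mean_average_precision(toy_runs):.4f}")
```

MAP suits benchmarks where a query can have several correct code snippets, because it rewards ranking all of them near the top rather than only the first match.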

Code Implementations: 1 repo