SE, AI, CL, LG | Mar 31, 2024

The Larger the Better? Improved LLM Code-Generation via Budget Reallocation

arXiv:2404.00725v2 | 50 citations | h-index: 33
Originality: Incremental advance
AI Analysis

This work addresses the efficiency and cost challenges in deploying large language models for code generation, offering a practical approach for resource-constrained scenarios, though it is incremental as it builds on existing model comparison and selection methods.

The study asks whether larger language models are always better under a fixed compute budget by comparing code generation across model sizes. It finds that repeatedly sampling from smaller models and selecting an output with unit tests can improve performance by up to 15% across five tasks, whereas ranking-based selection without unit tests underperforms a single output from a larger model.

It is a common belief that large language models (LLMs) are better than smaller-sized ones. However, larger models also require significantly more time and compute during inference. This begs the question: what happens when both models operate under the same budget (e.g., compute, run-time)? To address this question, we analyze code generation LLMs of various sizes and make comparisons such as running a 70B model once vs. generating five outputs from a 13B model. We consider a standard unit-test setup, which can be used to select the correct output from the smaller model. Our findings reveal that the repeated use of smaller models can yield consistent improvements, with gains of up to 15% across five tasks. On the other hand, in scenarios where unit-tests are unavailable, a ranking-based selection of candidates from the smaller model falls short of the performance of a single output from larger ones. Our results highlight the potential of using smaller models instead of larger ones, and the importance of studying approaches for ranking LLM outputs.

Code Implementations: 1 repo
