SE, AI, CL | Aug 2, 2024

ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models

arXiv:2408.00994v1 | 30 citations | h-index: 6
Originality: Highly original
AI Analysis

The paper addresses the challenge developers face in ensuring that LLM-generated code is of high quality and complies with its requirements, a novel extension beyond basic code generation.

The paper tackles the problem of generating code from textual descriptions that meets both functional and non-functional software requirements. It introduces ARCHCODE, which improves Pass@k scores on public benchmarks and outperforms baseline methods in handling non-functional requirements.

This paper aims to extend the code generation capability of large language models (LLMs) to automatically manage comprehensive software requirements from given textual descriptions. Such requirements include both functional (i.e., achieving expected behavior for inputs) and non-functional (e.g., time/space performance, robustness, maintainability) requirements. However, textual descriptions can express requirements verbosely or omit some of them altogether. We introduce ARCHCODE, a novel framework that leverages in-context learning to organize requirements observed in descriptions and to extrapolate unexpressed requirements from them. ARCHCODE generates requirements from given descriptions and conditions on them to produce code snippets and test cases. Each test case is tailored to one of the requirements, allowing code snippets to be ranked by how well their execution results comply with the requirements. On public benchmarks, ARCHCODE better satisfies functional requirements, significantly improving Pass@k scores. Furthermore, we introduce HumanEval-NFR, the first benchmark for evaluating how well LLM-generated code satisfies non-functional requirements, and demonstrate ARCHCODE's superiority over baseline methods on it. The implementation of ARCHCODE and the HumanEval-NFR benchmark are both publicly accessible.
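The ranking step described in the abstract can be illustrated with a minimal sketch. This is not ARCHCODE's implementation: the names `RequirementTest`, `compliance_score`, and `rank_candidates` are hypothetical, the candidates and tests are hand-written stand-ins for what an LLM would generate, and the scoring rule (count of passed requirement-specific tests) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RequirementTest:
    """A test case tied to exactly one requirement (hypothetical structure)."""
    requirement: str                    # e.g. "functional", "robustness"
    check: Callable[[Callable], bool]   # True if the candidate complies


def compliance_score(candidate: Callable, tests: List[RequirementTest]) -> int:
    """Count how many requirement-specific tests the candidate passes.

    A test that raises is treated as a failure (assumed scoring rule).
    """
    score = 0
    for test in tests:
        try:
            if test.check(candidate):
                score += 1
        except Exception:
            pass  # a crash counts as non-compliance with that requirement
    return score


def rank_candidates(candidates: Dict[str, Callable],
                    tests: List[RequirementTest]) -> List[str]:
    """Return candidate names ordered from most to least compliant."""
    return sorted(candidates,
                  key=lambda name: compliance_score(candidates[name], tests),
                  reverse=True)


# Hand-written stand-ins for generated code snippets (task: mean of a list).
def mean_naive(xs):
    return sum(xs) / len(xs)            # crashes on an empty list


def mean_robust(xs):
    return sum(xs) / len(xs) if xs else 0.0


def mean_wrong(xs):
    return xs[0]                        # functionally incorrect


CANDIDATES = {"naive": mean_naive, "robust": mean_robust, "wrong": mean_wrong}

# One test per requirement, standing in for generated test cases.
TESTS = [
    RequirementTest("functional", lambda f: f([1, 2, 3]) == 2),
    RequirementTest("robustness", lambda f: f([]) == 0.0),
]

if __name__ == "__main__":
    for name in rank_candidates(CANDIDATES, TESTS):
        print(name, compliance_score(CANDIDATES[name], TESTS))
```

Because each test is tagged with its requirement, the same structure could also report which requirement a candidate violates, not only its overall rank.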

Code Implementations: 1 repo