CodeSift: An LLM-Based Reference-Less Framework for Automatic Code Validation
CodeSift addresses the challenge of validating large volumes of generated code, offering developers and researchers an automated solution, though the contribution appears incremental because it builds on existing LLM capabilities.
The paper tackles the problem of ensuring functional correctness in LLM-generated code. It introduces CodeSift, a reference-less framework that uses LLMs as a first-line filter without executing the code, which reduces validation effort. Reported results show that it outperforms state-of-the-art code evaluation methods across three datasets covering two programming languages, and that its output aligns with human preference.
The advent of large language models (LLMs) has greatly facilitated code generation, but ensuring the functional correctness of generated code remains a challenge. Traditional validation methods are often time-consuming, error-prone, and impractical for large volumes of code. We introduce CodeSift, a novel framework that leverages LLMs as the first-line filter of code validation without the need for execution, reference code, or human feedback, thereby reducing the validation effort. We assess the effectiveness of our method across three diverse datasets encompassing two programming languages. Our results indicate that CodeSift outperforms state-of-the-art code evaluation methods. Internal testing conducted with subject matter experts reveals that the output generated by CodeSift is in line with human preference, reinforcing its effectiveness as a dependable automated code validation tool.
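To make the "first-line filter" idea concrete, the sketch below shows one possible shape for a reference-less, execution-free filter: each candidate is passed with its task description to a judge, and only accepted candidates move on to costlier validation. This is not CodeSift's actual method; the abstract does not describe its prompts or decision rule. The names `Candidate`, `Judge`, `first_line_filter`, and `toy_judge` are hypothetical, and `toy_judge` is a trivial stand-in for where an LLM call would go.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

# Hypothetical judge signature: (task_description, code) -> True if the code
# is judged to fulfil the task. In practice this would wrap an LLM call; how
# CodeSift prompts the model is not stated in this section.
Judge = Callable[[str, str], bool]


@dataclass
class Candidate:
    task: str  # natural-language description of what the code should do
    code: str  # generated code, treated purely as text (never executed)


def first_line_filter(
    candidates: Iterable[Candidate], judge: Judge
) -> Tuple[List[Candidate], List[Candidate]]:
    """Split candidates into (accepted, rejected) using only the judge.

    No reference solution, test suite, or execution is involved.
    """
    accepted: List[Candidate] = []
    rejected: List[Candidate] = []
    for cand in candidates:
        if judge(cand.task, cand.code):
            accepted.append(cand)
        else:
            rejected.append(cand)
    return accepted, rejected


def toy_judge(task: str, code: str) -> bool:
    # Placeholder heuristic standing in for an LLM verdict (assumption):
    # accept code that defines a function and returns a value.
    return "def " in code and "return" in code


candidates = [
    Candidate("add two numbers", "def add(a, b):\n    return a + b"),
    Candidate("add two numbers", "print('hello')"),
]
accepted, rejected = first_line_filter(candidates, toy_judge)
```

Passing the judge in as a callable keeps the filtering logic independent of any particular model or prompt, so the stub can be swapped for a real LLM-backed judge without changing the surrounding code.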