SE | AI | Apr 21, 2025

Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs

arXiv:2504.15210v2 | 12 citations | h-index: 2 | Has Code | NAACL
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating high-quality code for software developers, but it is incremental as it builds on existing methods like CodeRL.

The paper fine-tunes code-generating LLMs with reinforcement learning and direct preference optimization, using symbolic execution to enrich the reward model's training data. The resulting reward models significantly outperform the CodeRL baseline at estimating code quality, while the fine-tuned LLMs achieve results similar to CodeRL.

Code-generating Large Language Models (LLMs) have become essential tools in modern software development, enhancing productivity and accelerating development. This paper aims to investigate the fine-tuning of code-generating LLMs using Reinforcement Learning and Direct Preference Optimization, further improving their performance. To achieve this, we enhance the training data for the reward model with the help of symbolic execution techniques, ensuring more comprehensive and objective data. With symbolic execution, we create a custom dataset that better captures the nuances in code evaluation. Our reward models, fine-tuned on this dataset, demonstrate significant improvements over the baseline, CodeRL, in estimating the quality of generated code. Our code-generating LLMs, trained with the help of reward model feedback, achieve similar results compared to the CodeRL benchmark.
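
The abstract does not spell out how the dataset is built, so the following is only a minimal sketch under stated assumptions: that test inputs derived from symbolic execution (one per explored branch) are used to score candidate programs, and that the resulting pass rates are turned into preference pairs. The candidate functions, the test cases, and the helper names `pass_rate` and `preference_pairs` are hypothetical and hand-written for illustration. `dpo_loss` is the standard per-pair DPO objective, not necessarily the exact variant used in the paper.

```python
import math
from itertools import combinations

# Hypothetical stand-ins for inputs a symbolic execution tool might emit,
# one per branch of an absolute-value function (x < 0, x == 0, x > 0).
TEST_CASES = [((-3,), 3), ((0,), 0), ((5,), 5)]

# Hypothetical candidate programs, as an LLM might generate them.
CANDIDATES = {
    "correct": lambda x: x if x >= 0 else -x,
    "no_negatives": lambda x: x,   # wrong on the x < 0 branch
    "crashes": lambda x: 1 // x,   # wrong everywhere, raises on x == 0
}


def pass_rate(candidate, test_cases):
    """Fraction of (args, expected) cases the candidate answers correctly."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash counts as a failed test
    return passed / len(test_cases)


def preference_pairs(candidates, test_cases):
    """Score every candidate, then emit (chosen, rejected) pairs; ties are skipped."""
    scores = {name: pass_rate(fn, test_cases) for name, fn in candidates.items()}
    pairs = []
    for a, b in combinations(scores, 2):
        if scores[a] > scores[b]:
            pairs.append((a, b))
        elif scores[b] > scores[a]:
            pairs.append((b, a))
    return scores, pairs


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    return math.log1p(math.exp(-margin))


scores, pairs = preference_pairs(CANDIDATES, TEST_CASES)
print(scores)
print(pairs)
```

Using a pass rate rather than a single pass/fail bit is one way to get the finer-grained labels the abstract alludes to; in a real pipeline the log-probabilities passed to `dpo_loss` would come from the policy and a frozen reference model.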

