Breaking the Myth: Can Small Models Infer Postconditions Too?
This addresses the challenge of high computational costs for automated specification generation in software engineering, offering a practical and efficient alternative to large models, though it is incremental as it builds on existing fine-tuning methods.
The paper tackles the problem of generating formal specifications for software correctness by showing that a small, fine-tuned 7B-parameter language model can achieve high-quality postcondition generation, matching or outperforming larger models like GPT-4o on benchmarks such as Defects4J in syntax correctness, semantic correctness, and bug-distinguishing capability.
Formal specifications are essential for ensuring software correctness, yet manually writing them is tedious and error-prone. Large Language Models (LLMs) have shown promise in generating such specifications from natural language intents, but the giant model size and high computational demands raise a fundamental question: Do we really need large models for this task? In this paper, we show that a small, fine-tuned language model can achieve high-quality postcondition generation with much lower computational costs. We construct a specialized dataset of prompts, reasoning logs, and postconditions, then supervise the fine-tuning of a $7$B-parameter code model. Our approach tackles real-world repository dependencies and preserves pre-state information, allowing for expressive and accurate specifications. We evaluate the model on a benchmark of real-world Java bugs (Defects4J) and compare against both proprietary giants (e.g., GPT-4o) and open-source large models. Empirical results demonstrate that our compact model matches or outperforms significantly larger counterparts in syntax correctness, semantic correctness, and bug-distinguishing capability. These findings highlight that targeted fine-tuning on a modest dataset can enable small models to achieve results formerly seen only in massive, resource-heavy LLMs, offering a practical and efficient path for the real-world adoption of automated specification generation.