CLPLSEFeb 17, 2025

Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity

arXiv:2502.11901v23 citationsh-index: 3ACL
Originality Incremental advance
AI Analysis

This addresses the challenge of training LMs for proof-oriented programming under data scarcity, which is incremental as it builds on existing methods with a novel data synthesis approach.

The paper tackles the problem of data scarcity in proof-oriented programming for language models by introducing a synthetic data augmentation method, resulting in a fine-tuned 14B parameter model, PoPilot, that outperforms GPT-4o by 64% in project-level tasks and improves GPT-4o's performance by 54% through repair.

Existing LMs struggle with proof-oriented programming due to data scarcity, which manifest in two key ways: (1) a lack of sufficient corpora for proof-oriented programming languages such as F*, and (2) the absence of large-scale, project-level proof-oriented implementations that can teach the model the intricate reasoning process when performing proof-oriented programming. We present the first on synthetic data augmentation for project level proof oriented programming for both generation and repair. Our method addresses data scarcity by synthesizing basic proof-oriented programming problems for proficiency in that language; incorporating diverse coding data for reasoning capability elicitation and creating new proofs and repair data within existing repositories. This approach enables language models to both synthesize and repair proofs for function- and repository-level code. We show that our fine-tuned 14B parameter model, PoPilot, can exceed the performance of the models that outperforms GPT-4o in project-level proof-oriented programming by 64% relative margin, and can improve GPT-4o's performance by 54% by repairing its outputs over GPT-4o's self-repair.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes