SE AI LG PL · Feb 21, 2025

FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs

Madhurima Chakraborty, Peter Pirkelbauer, Qing Yi

arXiv:2502.15217v1 · 5 citations · h-index: 10 · Has Code · MSR

Originality: Synthesis-oriented

AI Analysis

This dataset addresses a gap for researchers and developers working on specification inference tools and AI-assisted software development, but its contribution is incremental: it provides a new resource rather than a novel method.

The authors tackled the lack of standardized benchmarks for verifying formal specifications in C++ programs by creating FormalSpecCpp, a comprehensive dataset of C++ programs with preconditions and postconditions, which they made publicly available to advance research in program verification and AI-assisted software development.

FormalSpecCpp is a dataset designed to fill the gap in standardized benchmarks for verifying formal specifications in C++ programs. To the best of our knowledge, this is the first comprehensive collection of C++ programs with well-defined preconditions and postconditions. It provides a structured benchmark for evaluating specification inference tools and testing the accuracy of generated specifications. Researchers and developers can use this dataset to benchmark specification inference tools, fine-tune Large Language Models (LLMs) for automated specification generation, and analyze the role of formal specifications in improving program verification and automated testing. By making this dataset publicly available, we aim to advance research in program verification, specification inference, and AI-assisted software development. The dataset and the code are available at https://github.com/MadhuNimmo/FormalSpecCpp.
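
To make "a C++ program with preconditions and postconditions" concrete, here is a minimal sketch of what one entry might look like. It is not taken from the dataset: the function name `maxElement` is hypothetical, and writing the specification as `assert` statements is an assumption made for illustration. The actual format used in FormalSpecCpp may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical example, not drawn from FormalSpecCpp.
// Returns the largest value in a non-empty vector.
int maxElement(const std::vector<int>& values) {
    // Precondition: the input must contain at least one element.
    assert(!values.empty());

    int result = values[0];
    for (int v : values) {
        if (v > result) {
            result = v;
        }
    }

    // Postcondition 1: the result is one of the input elements.
    assert(std::find(values.begin(), values.end(), result) != values.end());
    // Postcondition 2: no input element is greater than the result.
    assert(std::all_of(values.begin(), values.end(),
                       [result](int v) { return v <= result; }));

    return result;
}
```

Under this assumed encoding, a specification inference tool could be scored by comparing the conditions it generates for the function body against the asserted ones.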
