SEAIFLPLFeb 21, 2025

On the Effectiveness of Large Language Models in Writing Alloy Formulas

arXiv:2502.15441v110 citationsh-index: 52
Originality Synthesis-oriented
AI Analysis

This addresses the problem of specification writing for software developers, offering a potentially useful tool, though it appears incremental as it applies existing LLMs to a new domain without major methodological breakthroughs.

The paper tackles the challenge of writing correct declarative specifications by evaluating large language models (LLMs) like ChatGPT and DeepSeek in synthesizing Alloy formulas from natural language descriptions, creating equivalent formulas, and completing formula sketches, with results showing that LLMs generally perform well and can enumerate multiple unique solutions.

Declarative specifications have a vital role to play in developing safe and dependable software systems. Writing specifications correctly, however, remains particularly challenging. This paper presents a controlled experiment on using large language models (LLMs) to write declarative formulas in the well-known language Alloy. Our use of LLMs is three-fold. One, we employ LLMs to write complete Alloy formulas from given natural language descriptions (in English). Two, we employ LLMs to create alternative but equivalent formulas in Alloy with respect to given Alloy formulas. Three, we employ LLMs to complete sketches of Alloy formulas and populate the holes in the sketches by synthesizing Alloy expressions and operators so that the completed formulas accurately represent the desired properties (that are given in natural language). We conduct the experimental evaluation using 11 well-studied subject specifications and employ two popular LLMs, namely ChatGPT and DeepSeek. The experimental results show that the LLMs generally perform well in synthesizing complete Alloy formulas from input properties given in natural language or in Alloy, and are able to enumerate multiple unique solutions. Moreover, the LLMs are also successful at completing given sketches of Alloy formulas with respect to natural language descriptions of desired properties (without requiring test cases). We believe LLMs offer a very exciting advance in our ability to write specifications, and can help make specifications take a pivotal role in software development and enhance our ability to build robust software.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes