LO AI LG Mar 8, 2023

nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models

Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt, Caroline Trippel

arXiv:2303.04864v122.8124 citations h-index: 16 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the error-prone and time-consuming task of writing formal specifications, which falls to verification engineers. The advance is incremental, since it applies existing LLM capabilities to a specific domain.

The authors address the problem of manually writing formal specifications for system verification with nl2spec, a framework that uses Large Language Models to translate unstructured natural language into temporal logics. Users correct erroneous translations by interactively adding, deleting, and editing sub-translations, and a user study supplies a challenging dataset for evaluating translation quality.

A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.
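
The sub-translation workflow described in the abstract can be pictured with a small sketch. This is not the nl2spec implementation or its API: the class `SubTranslations`, its methods, and the hint format are hypothetical names chosen for illustration, and the LTL strings are plain text that is never parsed or checked. The sketch assumes only what the abstract states, namely that natural language fragments are paired with subformulas and that users add, delete, and edit those pairs.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SubTranslations:
    """Hypothetical store of (natural-language fragment -> LTL subformula) pairs."""
    pairs: Dict[str, str] = field(default_factory=dict)

    def add(self, fragment: str, formula: str) -> None:
        # Record a new pairing proposed by the model or supplied by the user.
        self.pairs[fragment] = formula

    def edit(self, fragment: str, formula: str) -> None:
        # Correct an existing pairing; refuse to silently create a new one.
        if fragment not in self.pairs:
            raise KeyError(fragment)
        self.pairs[fragment] = formula

    def delete(self, fragment: str) -> None:
        # Drop a pairing the user considers wrong or irrelevant.
        del self.pairs[fragment]

    def as_prompt_hints(self) -> str:
        # Render the pairs as text that could accompany the next translation request.
        return "\n".join(f'"{nl}" : {ltl}' for nl, ltl in self.pairs.items())


# Requirement: "Every request is eventually granted."
subs = SubTranslations()
subs.add("eventually granted", "X grant")   # assumed faulty first proposal ("next")
subs.edit("eventually granted", "F grant")  # user corrects it to "finally"
print(subs.as_prompt_hints())
```

The point of the design is that the user fixes one small pairing ("eventually" should be `F`, not `X`) instead of redrafting the whole formula; the corrected pairs would then be handed back to the model for a new full translation.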
