PL · CL · Oct 5, 2023

Trustworthy Formal Natural Language Specifications

arXiv:2310.03885v1 · 4 citations · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses mistranslation and auditability in formal verification, a concern for software developers and researchers. The advance is incremental because it builds on the existing trust principles of proof assistants.

The paper tackles the challenge of translating natural language specifications into formal claims for proof assistants. Its method lets specifications be written in a formal subset of English inside Lean and translates them automatically, producing proof certificates that record how each sentence was interpreted. The prototype correctly translated specifications from a popular textbook using a modest lexicon.

Interactive proof assistants are computer programs carefully constructed to check a human-designed proof of a mathematical claim with high confidence in the implementation. However, this only validates truth of a formal claim, which may have been mistranslated from a claim made in natural language. This is especially problematic when using proof assistants to formally verify the correctness of software with respect to a natural language specification. The translation from informal to formal remains a challenging, time-consuming process that is difficult to audit for correctness. This paper shows that it is possible to build support for specifications written in expressive subsets of natural language, within existing proof assistants, consistent with the principles used to establish trust and auditability in proof assistants themselves. We implement a means to provide specifications in a modularly extensible formal subset of English, and have them automatically translated into formal claims, entirely within the Lean proof assistant. Our approach is extensible (placing no permanent restrictions on grammatical structure), modular (allowing information about new words to be distributed alongside libraries), and produces proof certificates explaining how each word was interpreted and how the sentence's structure was used to compute the meaning. We apply our prototype to the translation of various English descriptions of formal specifications from a popular textbook into Lean formalizations; all can be translated correctly with a modest lexicon with only minor modifications related to lexicon size.
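To make the idea of a lexicon-driven translation concrete, here is a minimal Lean 4 sketch of how a sentence's meaning might be computed from per-word entries. It is not the paper's implementation: every name in it (`NP`, `S`, `four`, `isEven`, `applyLeft`, `fourIsEven`) is hypothetical, and it assumes a toy grammar with a single combination rule rather than the extensible grammar the abstract describes.

```lean
-- Hypothetical sketch: all names are illustrative and not taken from the paper's code.
namespace Sketch

-- Semantic types for two grammatical categories.
abbrev NP := Nat    -- noun phrases denote natural numbers
abbrev S  := Prop   -- sentences denote propositions

-- Lexicon: each word has a category (its type) and a meaning (its value).
def four : NP := 4
def isEven : NP → S := fun (n : Nat) => n % 2 = 0   -- expects an NP on its left

-- Grammar rule (backward application): an NP followed by a predicate forms an S.
def applyLeft (subject : NP) (predicate : NP → S) : S := predicate subject

-- "four is even", assembled only from lexicon entries and the grammar rule.
def fourIsEven : S := applyLeft four isEven

-- The derived meaning unfolds definitionally to the intended formal claim.
example : fourIsEven ↔ (4 : Nat) % 2 = 0 := Iff.rfl

-- The claim itself can then be proved as usual.
example : fourIsEven := by
  show (4 : Nat) % 2 = 0
  decide

end Sketch
```

In this sketch the definition of `fourIsEven` plays the role of a small certificate: it names the lexicon entry used for each word and the rule that combined them, and Lean checks that the result is definitionally the formal claim. New vocabulary would be added by declaring further entries like `four` and `isEven`, which is one way to read the modularity claim in the abstract.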

