SE · AI · May 13

Neurosymbolic Auditing of Natural-Language Software Requirements

arXiv:2605.13817 · 72.7 · Has Code
Predicted impact: top 32% in SE · last 90 days · Originality: Incremental advance
AI Analysis

For safety-critical software engineering, this work provides a method to automatically detect and repair defects in natural-language requirements, reducing ambiguity and enabling formal verification.

The paper introduces VERIMED, a neurosymbolic pipeline that uses LLMs and SMT solvers to audit natural-language software requirements, detecting ambiguity, inconsistency, and safety violations. On a hemodialysis benchmark, counterexample-guided repair raises verified accuracy from 55.4% to 98.5%.

Natural-language software requirements are often ambiguous, inconsistent, and underspecified; in safety-critical domains, these defects propagate into formal models that verify the wrong specification and into implementations that ship unsafe behavior. We show that large language models, equipped with an SMT solver, can audit such requirements: translating them into formal logic, detecting ambiguity through stochastic variation in the generated formalization, and exposing inconsistency, vacuousness, and safety violations through solver queries on the resulting specification. We present VERIMED, a neurosymbolic pipeline that operationalizes this idea for medical-device software requirements, and report two findings. First, stochastic variation across independent formalizations is a signal of ambiguity: requirements that admit multiple plausible interpretations produce SMT-inequivalent formalizations, and bidirectional SMT equivalence checking turns this disagreement into a solver-checkable test. Second, the usefulness of symbolic feedback depends on its granularity: in counterexample-guided repair on a hemodialysis question-answering benchmark, concrete SMT counterexamples raise verified accuracy from 55.4% to 98.5%. Over an extensive experimental evaluation on open-source hemodialysis safety requirements, we show that the LLM-based approach in VERIMED successfully reduces ambiguity-sensitive requirements and enables rigorous auditing of software requirements through SMT-based queries.
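To make the bidirectional equivalence idea concrete, here is a minimal sketch, not the paper's implementation. It assumes an invented requirement ("The alarm shall sound when blood flow drops below 200 mL/min") and two hypothetical formalizations of it, `reading_a` and `reading_b`. Exhaustive enumeration over a small, made-up state space stands in for the SMT solver; a real pipeline would instead ask a solver whether `p and not q` is satisfiable in each direction.

```python
from itertools import product

# Hypothetical toy state space: blood-flow rate in mL/min and an alarm flag.
FLOW_RATES = range(0, 601, 50)
ALARM = (False, True)


def reading_a(flow, alarm):
    # Interpretation A: flow strictly below 200 implies the alarm is on.
    return (not flow < 200) or alarm


def reading_b(flow, alarm):
    # Interpretation B: the alarm is on if and only if flow is at most 200.
    return alarm == (flow <= 200)


def implies_everywhere(p, q):
    """Return None if p -> q holds on every state, else the first counterexample."""
    for flow, alarm in product(FLOW_RATES, ALARM):
        if p(flow, alarm) and not q(flow, alarm):
            return {"flow": flow, "alarm": alarm}
    return None


def check_equivalence(p, q):
    """Check both directions; disagreement in either one signals ambiguity."""
    forward = implies_everywhere(p, q)
    backward = implies_everywhere(q, p)
    return {
        "equivalent": forward is None and backward is None,
        "p_not_q": forward,   # state satisfying p but violating q, if any
        "q_not_p": backward,  # state satisfying q but violating p, if any
    }


result = check_equivalence(reading_a, reading_b)
print(result)
```

In this toy case the two readings disagree at the boundary state (flow of exactly 200 with the alarm off), which interpretation A accepts and interpretation B rejects. A concrete state of this kind is the sort of fine-grained feedback the abstract describes passing back for counterexample-guided repair.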

