Laura Kovacs

LO
h-index11
3papers
1citation
Novelty58%
AI Score45

3 Papers

PLDec 17, 2025
A Neurosymbolic Approach to Loop Invariant Generation via Weakest Precondition Reasoning

Daragh King, Vasileios Koutavas, Laura Kovacs

Loop invariant generation remains a critical bottleneck in automated program verification. Recent work has begun to explore the use of Large Language Models (LLMs) in this area, yet these approaches tend to lack a reliable and structured methodology, with little reference to existing program verification theory. This paper presents NeuroInv, a neurosymbolic approach to loop invariant generation. NeuroInv comprises two key modules: (1) a neural reasoning module that leverages LLMs and Hoare logic to derive and refine candidate invariants via backward-chaining weakest precondition reasoning, and (2) a verification-guided symbolic module that iteratively repairs invariants using counterexamples from OpenJML. We evaluate NeuroInv on a comprehensive benchmark of 150 Java programs, encompassing single and multiple (sequential) loops, multiple arrays, random branching, and noisy code segments. NeuroInv achieves a $99.5\%$ success rate, substantially outperforming the other evaluated approaches. Additionally, we introduce a hard benchmark of $10$ larger multi-loop programs (with an average of $7$ loops each); NeuroInv's performance in this setting demonstrates that it can scale to more complex verification scenarios.

47.0LOMay 15
Certificate-Aware Property-Directed Reachability

Arman Ferdowsi, Laura Kovacs

Property-Directed Reachability (PDR/IC3) is a standard workhorse for hardware safety verification, but most implementations are tuned primarily for time-to-answer and treat the produced invariant or counterexample as a secondary byproduct. In certified workflows, including recent hardware model checking competition rules, the certificate becomes a deliverable whose size, independent checking time, and reproducibility directly affect end-to-end cost. We present CAPDR, a certificate-aware variant of PDR that targets a joint objective over runtime, certificate size, and checker time, while keeping learning outside the trusted computing base. CAPDR exposes a small set of PDR choice points (blocker generalization, obligation ordering, clause pushing, and optional extensions) to a learned ranking policy, but preserves trust by design: every state-changing action is guarded by the same SAT checks as standard PDR, and a SAFE/UNSAFE claim is reported only after an independent checker validates the emitted invariant or trace. We formalize certificate-centric metrics and a replay log that records nondeterministic choices for artifact-grade reproducibility. On the 2024 Hardware Model Checking Competition bit-level safety benchmarks, CAPDR solves six more instances than the baseline. Over each configuration's checker-accepted solved set, the median certificate-size proxy decreases by 24.6% and the median checker time by 49%. Post-fixpoint invariant minimization yields further reductions.

SEJul 3, 2025
FuzzFeed: An Automatic Approach to Weakest Precondition Generation using LLMs and Fuzzing

Daragh King, Vasileios Koutavas, Laura Kovacs

The weakest precondition (WP) of a program describes the largest set of initial states from which all terminating executions of the program satisfy a given postcondition. The generation of WPs is an important task with practical applications in areas ranging from verification to run-time error checking. This paper proposes the combination of Large Language Models (LLMs) and fuzz testing for generating WPs. In pursuit of this goal, we introduce Fuzzing Guidance (FG); FG acts as a means of directing LLMs towards correct WPs using program execution feedback. FG utilises fuzz testing for approximately checking the validity and weakness of candidate WPs, this information is then fed back to the LLM as a means of context refinement. We demonstrate the effectiveness of our approach on a comprehensive benchmark set of deterministic array programs in Java. Our experiments indicate that LLMs are capable of producing viable candidate WPs, and that this ability can be practically enhanced through FG.