Lemur: Integrating Large Language Models in Automated Program Verification
This work addresses the challenge of high-level abstract reasoning in program verification, with developers and users of verification tools as its audience; it represents an incremental advance.
The authors tackle automated program verification by integrating large language models with automated reasoners, and demonstrate practical improvements on synthetic and competition benchmarks.
The demonstrated code-understanding capability of LLMs raises the question of whether they can be used for automated program verification, a task that demands high-level abstract reasoning about program properties that is challenging for verification tools. We propose a general methodology to combine the power of LLMs and automated reasoners for automated program verification. We formally describe this methodology as a set of transition rules and prove its soundness. We instantiate the calculus as a sound automated verification procedure and demonstrate practical improvements on a set of synthetic and competition benchmarks.
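The general idea of pairing an untrusted proposer with a trusted checker can be illustrated with a minimal sketch. It is not the calculus from the paper: the toy program, the function names (`propose_invariants`, `check_invariant`, `verify`), and the bounded enumeration that stands in for an automated reasoner are all assumptions made for illustration. A real instantiation would query an LLM for candidates and discharge the checks with a sound verifier or solver.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, int]          # (x, n)
Pred = Callable[[State], bool]

# Toy program (assumed for illustration):
#   x = 0; while x < n: x += 2; assert x % 2 == 0
def guard(s: State) -> bool:
    return s[0] < s[1]

def step(s: State) -> State:
    return (s[0] + 2, s[1])

def prop(s: State) -> bool:
    return s[0] % 2 == 0

def propose_invariants() -> List[Tuple[str, Pred]]:
    """Hypothetical stand-in for the LLM: returns candidate invariants,
    which may be wrong and are therefore never trusted directly."""
    return [
        ("x <= n", lambda s: s[0] <= s[1]),          # not inductive
        ("x % 2 == 0", lambda s: s[0] % 2 == 0),     # inductive
    ]

def all_states(bound: int) -> List[State]:
    return [(x, n) for x in range(bound) for n in range(bound)]

def check_invariant(inv: Pred, bound: int) -> bool:
    """Stand-in for the automated reasoner. Checks initiation and
    consecution by enumeration over a bounded state space; a real
    tool would prove these conditions for all states."""
    if not all(inv((0, n)) for n in range(bound)):
        return False
    return all(inv(step(s)) for s in all_states(bound) if inv(s) and guard(s))

def verify(bound: int = 20) -> Optional[str]:
    """Accept the first candidate that is checked to be an invariant
    and that implies the property at loop exit."""
    for name, inv in propose_invariants():
        if not check_invariant(inv, bound):
            continue  # rejected proposals are discarded
        if all(prop(s) for s in all_states(bound) if inv(s) and not guard(s)):
            return name
    return None

if __name__ == "__main__":
    print(verify())
```

The design point the sketch tries to capture is that a proposal only influences the result after the checker has accepted it, so an incorrect suggestion costs time but cannot make the procedure claim a false property (up to the limits of the checker, which here is only a bounded stand-in).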