Vasileios Koutavas

5.9 · PL · Apr 28

Open-World Assertion Checking for Smart Contracts via Game Semantics

Vasileios Koutavas, Yu-Yang Lin, Nikos Tzevelekos

We present a game semantics framework for open-world safety analysis of Ethereum smart contracts. We model the interaction between a contract and its environment as a two-player game and prove, up to gas-model approximations, soundness (every assertion violation found corresponds to a real execution) and completeness (every open-world execution is captured). To our knowledge, this provides the first formal open-world interaction semantics for Ethereum smart contracts with mathematical guarantees of soundness and completeness. We implement this framework in YulTracer, an assertion reachability tool for real-world Solidity contracts, built on Yul, the intermediate language of the Solidity compiler. YulTracer uses concrete execution and exhaustively explores game traces within user-specified bounds. We evaluate it on reentrancy benchmarks, where YulTracer achieves 100% recall and precision (the only tool to do so among those we examined), and on two large real-world exploits (the DAO and PredyPool), where it detects the known vulnerabilities and produces no false positives on fixed versions. To our knowledge, YulTracer is the first tool to achieve this level of precision on real-world contracts without false positives. We additionally demonstrate the generality of the approach by examining access control benchmarks.
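The idea of exhaustively exploring contract/environment traces within a bound can be illustrated with a toy sketch. Everything here is hypothetical and is not YulTracer, Yul semantics, or the paper's game model: the `explore` function, the two environment moves (`call`, `ret`), the toy `withdraw()` logic, and the assertion `paid_out <= deposit` are assumptions chosen only to show how a reentrant trace can surface under bounded enumeration.

```python
def explore(safe, max_moves, bank_funds=3, deposit=1):
    """Return every environment move sequence (up to max_moves moves) that
    breaks the assertion `paid_out <= deposit` on a toy withdraw() contract.
    All names and the contract logic are hypothetical illustrations."""
    violations = []

    def step(funds, credit, paid_out, pending, trace):
        # Assertion check after each contract response.
        if paid_out > deposit:
            violations.append(tuple(trace))
            return
        if len(trace) == max_moves:
            return
        # Environment move "call": invoke withdraw(), possibly re-entrantly.
        if credit > 0 and funds >= credit:
            # The safe variant zeroes the credit before the external transfer;
            # the vulnerable variant leaves it until the transfer returns.
            new_credit = 0 if safe else credit
            step(funds - credit, new_credit, paid_out + credit,
                 pending + 1, trace + ["call"])
        else:
            # Guard fails: the call has no effect and returns immediately.
            step(funds, credit, paid_out, pending, trace + ["call"])
        # Environment move "ret": return from a pending transfer, after which
        # the credit is zero in both variants.
        if pending > 0:
            step(funds, 0, paid_out, pending - 1, trace + ["ret"])

    step(bank_funds, deposit, 0, 0, [])
    return violations
```

Under these assumptions, `explore(safe=False, max_moves=3)` reports the single trace in which the environment calls `withdraw()` again before returning, while `explore(safe=True, max_moves=6)` reports nothing within the bound.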

5.9 · SE · Jul 3, 2025

FuzzFeed: An Automatic Approach to Weakest Precondition Generation using LLMs and Fuzzing

Daragh King, Vasileios Koutavas, Laura Kovacs

The weakest precondition (WP) of a program describes the largest set of initial states from which all terminating executions of the program satisfy a given postcondition. The generation of WPs is an important task with practical applications in areas ranging from verification to run-time error checking. This paper proposes the combination of Large Language Models (LLMs) and fuzz testing for generating WPs. In pursuit of this goal, we introduce Fuzzing Guidance (FG), which directs LLMs towards correct WPs using program execution feedback. FG utilises fuzz testing to approximately check the validity and weakness of candidate WPs; this information is then fed back to the LLM as a means of context refinement. We demonstrate the effectiveness of our approach on a comprehensive benchmark set of deterministic array programs in Java. Our experiments indicate that LLMs are capable of producing viable candidate WPs, and that this ability can be practically enhanced through FG.
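A minimal sketch of what fuzz-based checking of a candidate WP might look like, written in Python rather than the Java used in the paper's benchmarks. The program, postcondition, input ranges, and the `fuzz_check` helper are all assumptions made for illustration and are not taken from FuzzFeed. A validity counterexample is an input the candidate accepts but on which the postcondition fails; a weakness counterexample is an input the candidate rejects although the postcondition holds.

```python
import random

def program(a):
    # Hypothetical array program: sum the elements, clamping negatives to 0.
    return sum(max(x, 0) for x in a)

def post(a, result):
    # Hypothetical postcondition: the result equals the plain sum of the input.
    return result == sum(a)

def fuzz_check(candidate, trials=2000, seed=0):
    """Approximately check a candidate precondition with random inputs.
    Returns (validity_counterexample, weakness_counterexample); each is None
    when no such input was found within the given number of trials."""
    rng = random.Random(seed)
    validity_cex, weakness_cex = None, None
    for _ in range(trials):
        a = [rng.randint(-3, 3) for _ in range(rng.randint(0, 4))]
        holds = post(a, program(a))
        accepted = candidate(a)
        if accepted and not holds and validity_cex is None:
            validity_cex = a   # candidate admits a failing input: not valid
        if not accepted and holds and weakness_cex is None:
            weakness_cex = a   # candidate excludes a passing input: too strong
    return validity_cex, weakness_cex
```

For this toy program the postcondition holds exactly when no element is negative, so `fuzz_check(lambda a: all(x >= 0 for x in a))` finds no counterexample of either kind, whereas the stronger candidate `all(x > 0 for x in a)` yields a weakness counterexample containing a zero. In a feedback loop of the kind the abstract describes, such counterexamples would be the information passed back to the LLM; absence of counterexamples is only approximate evidence, not a proof.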

2 Papers