Alessandro Sosso

2papers

2 Papers

45.6AIMay 22
Agentic Proving for Program Verification

Alessandro Sosso, Akhil Arora, Bas Spitters

Agentic systems have recently emerged as state-of-the-art approaches for automated theorem proving in formal mathematics. To assess how far these capabilities extend to program verification, we evaluate Claude Code in an agentic proving framework on CLEVER, a Lean 4 benchmark for verifiable code generation. Our results show that Claude generates arguably valid specifications for 98.8% of problems (with 81.3% also accepted by CLEVER's isomorphism-based scoring on the correct portion of the benchmark), certifies implementations against correct ground-truth specifications for 87.5% of problems, and reaches a 98.1% success rate on the end-to-end program generation and verification pipeline over entries with self-consistent premises. Across all stages, Claude further provides high-quality feedback on its own attempts (as confirmed under manual review), identifying underlying causes of failure and lingering bugs in the dataset. These findings highlight a growing mismatch between the difficulty of existing program verification benchmarks and the capabilities of modern agentic provers, and point to the need for more rigorous, bug-resilient evaluation methodologies, and in particular for alternatives to isomorphism-based scoring of generated specifications. More broadly, our results provide empirical evidence that tight compiler-in-the-loop agentic paradigms are currently the most effective approach for foundational program verification.

50.6GTMar 17
Faster Algorithms for the Least-Core value and the Nucleolus in Convex Games

Giacomo Maggiorano, Alessandro Sosso, Gautier Stauffer

The nucleolus is a central solution concept in cooperative game theory. While its computation is NP-hard in general, it can be computed in polynomial time for convex games; however, the only published polynomial-time algorithm relies on the ellipsoid method. We develop a combinatorial alternative based on reduced games and iterative least-core value computations. Leveraging submodular function minimization and polyhedral structure in a novel way, we obtain a faster combinatorial algorithm for computing the least-core value, improving the oracle complexity by a factor $n^3$ over previous approaches. As a consequence, we obtain a new strongly polynomial-time and combinatorial algorithm for computing the nucleolus in convex games. Preliminary analysis indicates an improved oracle complexity compared to the ellipsoid-based algorithm.