LOApr 17
Just Type It in Isabelle! AI Agents Drafting, Mechanizing, and Generalizing from Human HintsKevin Kappelmann, Maximilian Schäffeler, Lukas Stevens et al.
Type annotations are essential when printing terms in a way that preserves their meaning under reparsing and type inference. We study the problem of complete and minimal type annotations for rank-one polymorphic $λ$-calculus terms, as used in Isabelle. Building on prior work by Smolka, Blanchette et al., we give a metatheoretical account of the problem, with a full formal specification and proofs, and formalize it in Isabelle/HOL. Our development is a series of experiments featuring human-driven and AI-driven formalization workflows: a human and an LLM-powered AI agent independently produce pen-and-paper proofs, and the AI agent autoformalizes both in Isabelle, with further human-hinted AI interventions refining and generalizing the development.
AIMar 25, 2022
Formal Semantics and Formally Verified Validation for Temporal PlanningMohammad Abdulaziz, Lukas Koller
We present a simple and concise semantics for temporal planning. Our semantics are developed and formalised in the logic of the interactive theorem prover Isabelle/HOL. We derive from those semantics a validation algorithm for temporal planning and show, using a formal proof in Isabelle/HOL, that this validation algorithm implements our semantics. We experimentally evaluate our verified validation algorithm and show that it is practical.
AIJun 5, 2022
Formally Verified Solution Methods for Infinite-Horizon Markov Decision ProcessesMaximilian Schäfeller, Mohammad Abdulaziz
We formally verify executable algorithms for solving Markov decision processes (MDPs) in the interactive theorem prover Isabelle/HOL. We build on existing formalizations of probability theory to analyze the expected total reward criterion on infinite-horizon problems. Our developments formalize the Bellman equation and give conditions under which optimal policies exist. Based on this analysis, we verify dynamic programming algorithms to solve tabular MDPs. We evaluate the formally verified implementations experimentally on standard problems and show they are practical. Furthermore, we show that, combined with efficient unverified implementations, our system can compete with and even outperform state-of-the-art systems.
LOApr 22
Formal Primal-Dual Algorithm AnalysisMohammad Abdulaziz, Thomas Ammer
We present an ongoing effort to build a framework and a library in Isabelle/HOL for formalising primal-dual arguments for the analysis of algorithms. We discuss a number of example formalisations from the theory of matching algorithms, covering classical algorithms like the Hungarian Method, widely considered the first primal-dual algorithm, and modern algorithms like the Adwords algorithm, which models the assignment of search queries to advertisers in the context of search engines.
LOOct 11, 2025
Formally Verified Certification of Unsolvability of Temporal Planning ProblemsDavid Wang, Mohammad Abdulaziz
We present an approach to unsolvability certification of temporal planning. Our approach is based on encoding the planning problem into a network of timed automata, and then using an efficient model checker on the network followed by a certificate checker to certify the output of the model checker. Our approach prioritises trustworthiness of the certification: we formally verify our implementation of the encoding to timed automata using the theorem prover Isabelle/HOL and we use an existing certificate checker (also formally verified in Isabelle/HOL) to certify the model checking result.
AIJun 11, 2024
Formally Verified Approximate Policy IterationMaximilian Schäffeler, Mohammad Abdulaziz
We formally verify an algorithm for approximate policy iteration on Factored Markov Decision Processes using the interactive theorem prover Isabelle/HOL. Next, we show how the formalized algorithm can be refined to an executable, verified implementation. The implementation is evaluated on benchmark problems to show its practicability. As part of the refinement, we develop verified software to certify Linear Programming solutions. The algorithm builds on a diverse library of formalized mathematics and pushes existing methodologies for interactive theorem provers to the limits. We discuss the process of the verification project and the modifications to the algorithm needed for formal verification.
AIMar 3, 2021
Cost Optimal Planning as SatisfiabilityMohammad Abdulaziz
We investigate upper bounds on the length of cost optimal plans that are valid for problems with 0-cost actions. We employ these upper bounds as horizons for a SAT-based encoding of planning with costs. Given an initial upper bound on the cost of the optimal plan, we experimentally show that this SAT-based approach is able to compute plans with better costs, and in many cases it can match the optimal cost. Also, in multiple instances, the approach is successful in proving that a certain cost is the optimal plan cost.
AIOct 27, 2020
Formally Verified SAT-Based AI PlanningMohammad Abdulaziz, Friedrich Kurz
We present an executable formally verified SAT encoding of classical AI planning. We use the theorem prover Isabelle/HOL to perform the verification. We experimentally test the verified encoding and show that it can be used for reasonably sized standard planning benchmarks. We also use it as a reference to test a state-of-the-art SAT-based planner, showing that it sometimes falsely claims that problems have no solutions of certain lengths.
AIJun 1, 2020
Computing Plan-Length Bounds Using Lengths of Longest PathsMohammad Abdulaziz, Dominik Berger
We devise a method to exactly compute the length of the longest simple path in factored state spaces, like state spaces encountered in classical planning. Although the complexity of this problem is NEXP-Hard, we show that our method can be used to compute practically useful upper-bounds on lengths of plans. We show that the computed upper-bounds are significantly (in many cases, orders of magnitude) better than bounds produced by previous bounding techniques and that they can be used to improve the SAT-based planning.
DSJul 9, 2019
Trustworthy Graph AlgorithmsMohammad Abdulaziz, Kurt Mehlhorn, Tobias Nipkow
The goal of the LEDA project was to build an easy-to-use and extendable library of correct and efficient data structures, graph algorithms and geometric algorithms. We report on the use of formal program verification to achieve an even higher level of trustworthiness. Specifically, we report on an ongoing and largely finished verification of the blossom-shrinking algorithm for maximum cardinality matching.