Formally Verified Approximate Policy Iteration
This work addresses the need for formally verified reinforcement learning algorithms, though it is incremental as it builds on existing methodologies.
The researchers tackled the problem of verifying approximate policy iteration algorithms for Factored Markov Decision Processes by formally verifying an algorithm using Isabelle/HOL and refining it to an executable implementation, which was evaluated on benchmark problems to demonstrate practicability.
We formally verify an algorithm for approximate policy iteration on Factored Markov Decision Processes using the interactive theorem prover Isabelle/HOL. Next, we show how the formalized algorithm can be refined to an executable, verified implementation. The implementation is evaluated on benchmark problems to show its practicability. As part of the refinement, we develop verified software to certify Linear Programming solutions. The algorithm builds on a diverse library of formalized mathematics and pushes existing methodologies for interactive theorem provers to the limits. We discuss the process of the verification project and the modifications to the algorithm needed for formal verification.