LG AI LO

Feb 19, 2021

TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning

Minchao Wu, Michael Norrish, Christian Walder, Amir Dezfouli

arXiv:2102.09756v2

20.451 citations

Originality: Highly original

AI Analysis

This paper addresses the problem of efficient theorem proving for researchers and developers in formal verification, and it presents a novel method rather than an incremental improvement.

The paper tackles automated theorem proving by introducing a deep reinforcement learning framework that learns proof search strategies end-to-end, achieving results that outperform existing automated theorem provers in HOL4 on unseen problems.

We propose a novel approach to interactive theorem proving (ITP) using deep reinforcement learning. The proposed framework is able to learn proof search strategies as well as tactic and argument prediction in an end-to-end manner. We formulate the process of ITP as a Markov decision process (MDP) in which each state represents a set of potential derivation paths. This structure allows us to introduce a novel backtracking mechanism which enables the agent to efficiently discard (predicted) dead-end derivations and restart from promising alternatives. We implement the framework in the HOL4 theorem prover. Experimental results show that the framework outperforms existing automated theorem provers (i.e., hammers) available in HOL4 when evaluated on unseen problems. We further elaborate on the role of key components of the framework using ablation studies.
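
The abstract does not spell out how a state holds "a set of potential derivation paths", so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a state is a list of fringes, where each fringe is the tuple of goals still open on one derivation path, and that an action picks a fringe, a goal, and a tactic. The names `Fringe`, `step`, `split_conj`, and `trivial` are hypothetical, and the two tactics are toy string rewrites rather than HOL4 tactics. No learning is shown; a trained policy would choose the arguments passed to `step`.

```python
from typing import Callable, Optional

Goal = str
Fringe = tuple[Goal, ...]  # goals still open on one derivation path (assumed representation)
Tactic = Callable[[Goal], Optional[list[Goal]]]  # returns None when the tactic fails


def split_conj(goal: Goal) -> Optional[list[Goal]]:
    """Toy tactic: split 'A & B' into the subgoals 'A' and 'B'."""
    if " & " not in goal:
        return None
    left, right = goal.split(" & ", 1)
    return [left, right]


def trivial(goal: Goal) -> Optional[list[Goal]]:
    """Toy tactic: close the goal 'T', leaving no subgoals."""
    return [] if goal == "T" else None


def step(
    state: list[Fringe], fringe_idx: int, goal_idx: int, tactic: Tactic
) -> tuple[list[Fringe], bool]:
    """Apply `tactic` to one goal of one fringe.

    On success a new fringe is appended and all earlier fringes are kept,
    so a later action can return to any of them (backtracking).
    On failure the state is returned unchanged.
    The flag is True when the newly created fringe has no open goals.
    """
    fringe = state[fringe_idx]
    subgoals = tactic(fringe[goal_idx])
    if subgoals is None:
        return state, False
    new_fringe = fringe[:goal_idx] + tuple(subgoals) + fringe[goal_idx + 1:]
    return state + [new_fringe], len(new_fringe) == 0


# Usage: prove the toy goal 'T & T'.
state: list[Fringe] = [("T & T",)]
state, done = step(state, 0, 0, split_conj)  # adds the fringe ('T', 'T')
state, done = step(state, 1, 0, trivial)     # adds the fringe ('T',)
state, done = step(state, 2, 0, trivial)     # adds the empty fringe, proof complete
```

Because `step` never deletes a fringe, abandoning a dead-end derivation amounts to passing a different `fringe_idx` on the next action, which is one simple way to read the backtracking mechanism described above.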
