Prover Agent: An Agent-Based Framework for Formal Mathematical Proofs
This work addresses the problem of automating formal mathematical proofs for researchers and practitioners in AI and mathematics, representing a strong specific gain rather than a broad breakthrough.
The paper tackles automated theorem proving by introducing Prover Agent, an AI agent that integrates large language models with the Lean proof assistant to generate auxiliary lemmas and coordinate reasoning, achieving an 88.1% success rate on the MiniF2F benchmark, setting a new state-of-the-art for methods using small language models with lower sample budgets.
We present Prover Agent, a novel AI agent for automated theorem proving that integrates large language models (LLMs) with a formal proof assistant, Lean. Prover Agent coordinates an informal reasoning LLM, a formal prover model, and feedback from Lean while also generating auxiliary lemmas. These auxiliary lemmas are not limited to subgoals in the formal proof but can also include special cases or potentially useful facts derived from the assumptions, which help in discovering a viable proof strategy. It achieves an 88.1% success rate on the MiniF2F benchmark, establishing a new state-of-the-art among methods using small language models (SLMs) with a much lower sample budget than previous approaches. We also present theoretical analyses and case studies that illustrate how these generated lemmas contribute to solving challenging problems. Our code is publicly available at: https://github.com/kAIto47802/Prover-Agent.