CL AI LG LOMay 27, 2025

REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning

Ziju Shen, Naohao Huang, Fanyi Yang, Yutong Wang, Guoxiong Gao, Tianyi Xu, Jiedong Jiang, Wanyi He, Pu Yang, Mengzhou Sun, Haocheng Ju, Peihao Wu

arXiv:2505.20613v217.010 citationsh-index: 4Has Code

Originality Incremental advance

AI Analysis

This work addresses the problem of limited generalization in theorem provers for college-level mathematics, representing an incremental advance with new benchmarks and competitive performance.

The authors tackled the challenge of generalizing formal theorem provers to advanced mathematics by introducing REAL-Prover, a stepwise theorem prover for Lean 4 that achieved a 23.7% success rate on ProofNet and a state-of-the-art 56.7% success rate on the new FATE-M benchmark.

Nowadays, formal theorem provers have made monumental progress on high-school and competition-level mathematics, but few of them generalize to more advanced mathematics. In this paper, we present REAL-Prover, a new open-source stepwise theorem prover for Lean 4 to push this boundary. This prover, based on our fine-tuned large language model (REAL-Prover-v1) and integrated with a retrieval system (Leansearch-PS), notably boosts performance on solving college-level mathematics problems. To train REAL-Prover-v1, we developed HERALD-AF, a data extraction pipeline that converts natural language math problems into formal statements, and a new open-source Lean 4 interactive environment (Jixia-interactive) to facilitate synthesis data collection. In our experiments, our prover using only supervised fine-tune achieves competitive results with a 23.7% success rate (Pass@64) on the ProofNet dataset-comparable to state-of-the-art (SOTA) models. To further evaluate our approach, we introduce FATE-M, a new benchmark focused on algebraic problems, where our prover achieves a SOTA success rate of 56.7% (Pass@64).

View on arXiv PDF Code

Similar