LO CL LG PLFeb 11, 2025

MiniF2F in Rocq: Automatic Translation Between Proof Assistants -- A Case Study

Jules Viennot, Guillaume Baudart, Emilio Jesùs Gallego Arias, Marc Lelarge

arXiv:2503.04763v13 citationsh-index: 1Has Code

Originality Synthesis-oriented

AI Analysis

This work addresses the challenge of interoperability between proof assistants for researchers and developers in formal verification, though it is incremental as it applies existing LLM methods to a new translation task.

The researchers tackled the problem of automatically translating mathematical theorems between proof assistants by using state-of-the-art LLMs to convert MiniF2F into Rocq, achieving a success rate of 478 out of 488 theorems translated.

In this work, we conduct an experiment using state-of-the-art LLMs to translate MiniF2F into Rocq. The translation task focuses on generating a Rocq theorem based on three sources: a natural language description, the Lean formalization, and the Isabelle formalization. We conducted our experiment in 3 stages of increasing complexity, from basic one-shot prompting to multi-turn conversations that incorporate feedback from unsuccessful attempts. At each stage, we perform multiple rounds of translation using increasingly advanced models: GPT-4o mini, Claude 3.5 Sonnet, o1 mini, and o1. We successfully translated 478 out of 488 theorems. The dataset is opensource: https://github.com/LLM4Rocq/miniF2F-rocq.

View on arXiv PDF Code

Similar