CL, Jan 30

Monotonic Reference-Free Refinement for Autoformalization

arXiv:2601.23166v1
h-index: 14
Originality: Highly original
AI Analysis

This paper addresses the challenge of autoformalizing entire theorems rather than isolated statements, which is critical for advancing automated theorem proving and formal verification.

The paper tackles the problem of full-theorem autoformalization by introducing a reference-free iterative monotonic process that jointly optimizes multiple quality dimensions, achieving 93.44% formal validity and 78.22% overall score on miniF2F, and 44.09% formal validity and 29.79% overall score on ProofNet.

While statement autoformalization has advanced rapidly, full-theorem autoformalization remains largely unexplored. Existing iterative refinement methods in statement autoformalization typically improve isolated aspects of formalization, such as syntactic correctness, but struggle to jointly optimize multiple quality dimensions, which is critical for full-theorem autoformalization. We introduce a reference-free iterative monotonic process for full-theorem autoformalization that leverages complementary feedback from theorem provers and LLM-based judges, without access to ground-truth proofs or existing formalizations at inference time. Our approach optimizes a masked composite objective over Formal Validity, Logical Preservation, Mathematical Consistency, and Formal Quality, guided by a responsiveness map that indicates how LLMs acting in different roles preferentially improve each dimension. We further propose an acceptance policy that guarantees certified monotonic improvement, and provide conditions ensuring convergence and termination. Experiments demonstrate that the proposed process enables simultaneous improvement across multiple dimensions, achieving 93.44% formal validity and a 78.22% overall score on miniF2F, and 44.09% formal validity and a 29.79% overall score on ProofNet.
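The abstract does not spell out the objective or the acceptance rule, so the following is only a minimal sketch of what a monotonic accept/reject loop over a masked composite score could look like. Everything in it is an assumption made for illustration: the unweighted mean over masked-in dimensions, the "no active dimension may regress" condition, the `min_gain` threshold, and the hypothetical callables `score_fn` (standing in for theorem-prover and LLM-judge feedback) and `propose_fn` (standing in for an LLM refinement step). The responsiveness map is omitted.

```python
from typing import Callable, Dict, List, Tuple

# Dimension names taken from the abstract; the [0, 1] score range is an assumption.
DIMENSIONS = (
    "formal_validity",
    "logical_preservation",
    "mathematical_consistency",
    "formal_quality",
)

Scores = Dict[str, float]
Mask = Dict[str, bool]


def masked_composite(scores: Scores, mask: Mask) -> float:
    """Unweighted mean over the dimensions the mask switches on (assumed form)."""
    active = [d for d in DIMENSIONS if mask.get(d, False)]
    if not active:
        return 0.0
    return sum(scores[d] for d in active) / len(active)


def accept(current: Scores, candidate: Scores, mask: Mask,
           min_gain: float = 1e-6) -> bool:
    """Hypothetical acceptance policy: no active dimension regresses and the
    composite improves by at least min_gain."""
    active = [d for d in DIMENSIONS if mask.get(d, False)]
    no_regression = all(candidate[d] >= current[d] for d in active)
    gain = masked_composite(candidate, mask) - masked_composite(current, mask)
    return no_regression and gain >= min_gain


def refine(initial: str,
           score_fn: Callable[[str], Scores],
           propose_fn: Callable[[str, Scores], str],
           mask: Mask,
           max_iters: int = 10,
           min_gain: float = 1e-6) -> Tuple[str, List[float]]:
    """Keep a candidate only when accept() passes, so the recorded composite
    scores are strictly increasing by construction."""
    best = initial
    best_scores = score_fn(best)
    history = [masked_composite(best_scores, mask)]
    for _ in range(max_iters):
        candidate = propose_fn(best, best_scores)
        candidate_scores = score_fn(candidate)
        if accept(best_scores, candidate_scores, mask, min_gain):
            best, best_scores = candidate, candidate_scores
            history.append(masked_composite(best_scores, mask))
    return best, history
```

Under these assumptions, termination is straightforward: the loop is capped at `max_iters`, and because the composite is bounded above by 1 and every accepted step adds at least `min_gain`, only finitely many candidates can ever be accepted. The paper's actual conditions for convergence and termination may differ.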

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
