AI · Nov 15, 2025

Improving Autoformalization Using Direct Dependency Retrieval

arXiv:2511.11990v1 · 2 citations · h-index: 13
Originality: Highly original
AI Analysis

This work addresses the problem of poor precision and recall in formal library dependency retrieval for researchers in formal verification, representing a strong specific gain rather than a foundational advancement.

The paper tackles statement autoformalization by proposing a Direct Dependency Retrieval (DDR) method that improves contextual awareness and reduces hallucinations. DDR significantly outperforms state-of-the-art methods in retrieval precision and recall, and an autoformalizer equipped with it shows consistent advantages in accuracy and stability.

The convergence of deep learning and formal mathematics has spurred research in formal verification. Statement autoformalization, a crucial first step in this process, aims to translate informal descriptions into machine-verifiable representations but remains a significant challenge. The core difficulty lies in the fact that existing methods often suffer from a lack of contextual awareness, leading to hallucination of formal definitions and theorems. Furthermore, current retrieval-augmented approaches exhibit poor precision and recall for formal library dependency retrieval, and lack the scalability to effectively leverage ever-growing public datasets. To bridge this gap, we propose a novel retrieval-augmented framework based on DDR (Direct Dependency Retrieval) for statement autoformalization. Our DDR method directly generates candidate library dependencies from natural language mathematical descriptions and subsequently verifies their existence within the formal library via an efficient suffix array check. Leveraging this efficient search mechanism, we constructed a dependency retrieval dataset of over 500,000 samples and fine-tuned a high-precision DDR model. Experimental results demonstrate that our DDR model significantly outperforms SOTA methods in both retrieval precision and recall. Consequently, an autoformalizer equipped with DDR shows consistent performance advantages in both single-attempt accuracy and multi-attempt stability compared to models using traditional selection-based RAG methods.
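
The "generate, then verify" step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the formal library is available as a flat list of declaration names, that names are joined with a newline separator so only whole names match, and that the suffix array is built naively (fine for a toy library, too slow for a real one). The function names `build_index`, `exists`, and `verify_candidates` are hypothetical, and the three library names are just a toy example.

```python
SEP = "\n"  # assumed separator; must not occur inside a declaration name


def build_index(names):
    """Concatenate library names and build a suffix array over the text.

    Naive construction (sorting all suffixes) is used for clarity only.
    """
    text = SEP + SEP.join(names) + SEP
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return text, suffix_array


def exists(index, name):
    """Return True if `name` occurs as a whole entry in the indexed library."""
    text, suffix_array = index
    pattern = SEP + name + SEP  # delimiters rule out partial-name matches
    lo, hi = 0, len(suffix_array)
    # Binary search for the first suffix whose prefix is >= pattern.
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(suffix_array) and text.startswith(pattern, suffix_array[lo])


def verify_candidates(index, candidates):
    """Keep only the generated candidates that actually exist in the library."""
    return [c for c in candidates if exists(index, c)]


if __name__ == "__main__":
    # Toy library and hypothetical model output, for illustration only.
    index = build_index(["Nat.add_comm", "Real.sqrt", "Finset.sum_comm"])
    generated = ["Nat.add_comm", "Nat.add_commutes", "Finset.sum_comm"]
    print(verify_candidates(index, generated))
```

In this sketch a hallucinated name such as `Nat.add_commutes` is dropped because no suffix of the indexed text begins with it as a delimited entry. Each lookup costs a binary search over the suffix array, which is what makes filtering large batches of generated candidates cheap.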

