On Multi-Step Theorem Prediction via Non-Parametric Structural Priors
This work improves the generalization and scalability of multi-step theorem prediction in automated reasoning systems, particularly over evolving theorem libraries, by addressing the 'Structural Drift' bottleneck in in-context learning.
This paper addresses the challenge of multi-step theorem prediction, where existing neural-symbolic methods struggle with generalization. The authors propose a training-free in-context learning approach that uses Theorem Precedence Graphs to encode temporal dependencies and prune the search space, achieving 89.29% accuracy on the FormalGeo7k benchmark, outperforming ICL baselines and matching state-of-the-art supervised models.
Multi-step theorem prediction is a central challenge in automated reasoning. Existing neural-symbolic approaches rely heavily on supervised parametric models, which exhibit limited generalization to evolving theorem libraries. In this work, we explore training-free theorem prediction through the lens of in-context learning (ICL). We identify a critical scalability bottleneck, termed Structural Drift: as reasoning depth increases, the performance of vanilla ICL degrades sharply, often collapsing to near zero. We attribute this failure to the LLM's inability to recover latent topological dependencies, leading to unstructured exploration. To address this issue, we propose Theorem Precedence Graphs, which encode temporal dependencies from historical solution traces as directed graphs, and impose explicit topological constraints that effectively prune the search space during inference. Coupled with retrieval-augmented graph construction and a stepwise symbolic executor, our approach enables LLMs to act as structured planners without any gradient-based optimization. Experiments on the FormalGeo7k benchmark show that our method achieves 89.29% accuracy, substantially outperforming ICL baselines and matching state-of-the-art supervised models. These results indicate that explicit structural priors offer a promising direction for scaling LLM-based symbolic reasoning.
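To make the idea of a precedence graph concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a precedence edge a -> b is recorded whenever theorem b immediately follows theorem a in a historical solution trace, and that at inference time only successors of the most recently applied theorem are offered as candidates. The abstract does not specify the construction rule, the pruning rule, or the retrieval step, so these choices, the function names, and the theorem names are all illustrative assumptions.

```python
from collections import defaultdict


def build_precedence_graph(traces):
    """Count how often theorem a is immediately followed by theorem b.

    Assumption: an edge means "b directly followed a in some trace".
    """
    graph = defaultdict(lambda: defaultdict(int))
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            graph[a][b] += 1
    return graph


def allowed_next(graph, applied, library):
    """Return the theorems the graph permits as the next step.

    With no history, every theorem in the library is allowed. Otherwise
    only observed successors of the last applied theorem are kept,
    ordered by edge count (descending), then by name.
    """
    if not applied:
        return sorted(library)
    successors = graph.get(applied[-1], {})
    ranked = sorted(successors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked if name in library]


# Hypothetical solution traces; the theorem names are placeholders.
traces = [
    ["parallel_property", "triangle_angle_sum", "sine_theorem"],
    ["parallel_property", "triangle_angle_sum", "cosine_theorem"],
    ["triangle_angle_sum", "sine_theorem"],
]
library = {
    "parallel_property",
    "triangle_angle_sum",
    "sine_theorem",
    "cosine_theorem",
}

graph = build_precedence_graph(traces)
candidates = allowed_next(
    graph, ["parallel_property", "triangle_angle_sum"], library
)
print(candidates)
```

In this sketch the candidate list after two steps contains two theorems rather than the full library, which illustrates how a topological constraint can shrink the set of options an LLM planner must choose from at each step.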