Jujian Zhang

LO
h-index: 10
4 papers
22 citations
Novelty: 23%
AI Score: 42

4 Papers

AI · May 6, 2025 · Code
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics

Junqi Liu, Xiaohan Lin, Jonas Bayer et al.

Neurosymbolic approaches integrating large language models with formal reasoning have recently achieved human-level performance on mathematics competition problems in algebra, geometry and number theory. In comparison, combinatorics remains a challenging domain, characterized by a lack of appropriate benchmarks and theorem libraries. To address this gap, we introduce CombiBench, a comprehensive benchmark comprising 100 combinatorial problems, each formalized in Lean 4 and paired with its corresponding informal statement. The problem set covers a wide spectrum of difficulty levels, ranging from middle school to IMO and university level, and spans more than ten combinatorial topics. CombiBench is suitable for testing IMO-solving capabilities, as it includes all IMO combinatorial problems since 2000 (except IMO 2004 P3, whose statement contains an image). Furthermore, we provide a comprehensive and standardized evaluation framework for formal mathematics, dubbed Fine-Eval (for $\textbf{F}$ill-in-the-blank $\textbf{in}$ L$\textbf{e}$an Evaluation). It accommodates not only proof-based problems but also, for the first time, the evaluation of fill-in-the-blank questions. Using Fine-Eval as the evaluation method and Kimina Lean Server as the backend, we benchmark several LLMs on CombiBench and observe that their capabilities for formally solving combinatorial problems remain limited. Among all models tested (none of which has been trained for this particular task), Kimina-Prover attains the best results, solving 7 problems (out of 100) under both the "with solution" and "without solution" scenarios. We open-source the benchmark dataset along with the code of the proposed evaluation method at https://github.com/MoonshotAI/CombiBench/.

88.7 · CO · Apr 25
Formalizing $A_1^{(1)}$ Curve Neighborhoods in Lean 4

Yihe Huang, Sizhe Cui, Jiaqi Wang et al.

Combinatorial curve neighborhoods are a foundational ingredient in setting up quantum Schubert calculus for affine flag manifolds. In type $A_1^{(1)}$, these neighborhoods can be encoded entirely within the moment graph of the infinite dihedral group $D_\infty$. Building on the framework developed by Mihalcea and Norton, this paper presents a complete, axiom-free formalization of these combinatorial curve neighborhoods in Lean 4. Rather than merely wrapping mathematical statements, we formalize $D_\infty$ directly as a Coxeter system in order to compute length functions and degree maps explicitly. Reachable sets are defined through edge chains of bounded degree, and the curve neighborhood is characterized by the maximal vertices in these sets. The core of the work is the formal verification of the explicit combinatorial formulas for curve neighborhoods of arbitrary elements. By restricting the search space to finite sets, we also obtain a fully computable version of these neighborhoods.
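
As a rough illustration of the Coxeter-system view of $D_\infty$ mentioned in the abstract, the following Python sketch models the group as reduced words in two generators $s_0, s_1$ with $s_0^2 = s_1^2 = 1$ and no further relation. It is not the paper's Lean 4 formalization and does not implement degree maps or curve neighborhoods; the function names (`reduce_word`, `multiply`, `length`, `elements_up_to`) are hypothetical and chosen only for this example.

```python
from typing import Iterable, List, Tuple

# A group element is stored as a reduced word: a tuple of generator
# indices (0 for s0, 1 for s1) in which no two neighbours are equal.
Word = Tuple[int, ...]


def reduce_word(letters: Iterable[int]) -> Word:
    """Cancel adjacent equal generators, using s_i * s_i = 1."""
    stack: List[int] = []
    for g in letters:
        if g not in (0, 1):
            raise ValueError("generators must be 0 or 1")
        if stack and stack[-1] == g:
            stack.pop()        # s_i s_i cancels to the identity
        else:
            stack.append(g)
    return tuple(stack)


def multiply(u: Word, v: Word) -> Word:
    """Group product: concatenate the words, then reduce."""
    return reduce_word(u + v)


def length(w: Iterable[int]) -> int:
    """Coxeter length: number of letters in the reduced word."""
    return len(reduce_word(w))


def elements_up_to(n: int) -> List[Word]:
    """All elements of length <= n (a finite truncation of D_infinity)."""
    out: List[Word] = [()]
    for k in range(1, n + 1):
        for first in (0, 1):
            # The unique alternating word of length k starting with `first`.
            out.append(tuple((first + i) % 2 for i in range(k)))
    return out
```

Each length $k \ge 1$ contributes exactly two elements (the alternating words starting with $s_0$ and with $s_1$), so `elements_up_to(n)` returns $2n + 1$ words. Enumerating such a finite truncation is one simple way to make searches over the group computable, in the spirit of the finite restriction the abstract describes.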