Gergely Bérczi

AG
h-index89
9papers
21citations
Novelty46%
AI Score51

9 Papers

87.2AGMay 27
Real-rootedness of the Poincaré polynomials of $\overline{\mathcal M}_{0,n}$: an AI-assisted proof

Gergely Bérczi, Young-Hoon Kiem

We prove real-rootedness for the Poincaré polynomial \[ P_n(t)=\sum_{i=0}^{n-3} \dim H^{2i}(\overline{\mathcal M}_{0,n};\mathbb{Q})t^i \] of the Deligne--Mumford moduli space $\overline{\mathcal M}_{0,n}$ of stable $n$-pointed rational curves, proving a conjecture of Aluffi--Chen--Marcolli. The proof starts from the Keel--Manin--Getzler recurrence, but its main new idea is a bivariate deformation $F_m(y,t)$ of the Poincaré polynomial. This deformation reveals a hidden interlacing structure not visible in the one-variable recurrence. For fixed $t<0$, the zero set of $F_m$ in the $y$-direction is controlled by a Sturm--Rolle argument on the interval $0<y<1-t$. The original polynomial is recovered on the slice $y=1$, and the ordered crossings of the moving roots through this slice give both real-rootedness and strict interlacing. Consequently, the Betti numbers of $\overline{\mathcal M}_{0,n}$ form an ultra-log-concave sequence. We further prove real-rootedness and ultra-log-concavity for the Poincaré polynomial of the Fulton--MacPherson space $\mathbb{P}^1[n]$ of $n$ ordered points in degenerations of the complex projective line. The proof for $\overline{\mathcal M}_{0,n}$ was obtained through an iterative AI-assisted workflow with Co-Mathematician, an agentic frontier-model system developed by Google DeepMind. The human role was to pose the problem, evaluate successive attempts, request repairs of gaps, compare the evolving argument with the literature, and assemble the final human-verifiable proof. Our additional human contribution was to observe that a similar residual deformation strategy applies to the Fulton--MacPherson spaces $\mathbb P^1[n]$, yielding the corresponding real-rootedness theorem.

83.8AGMay 24
Positivity in classical enumerative geometry: a case study in synchronized AI-assisted mathematics

Gergely Bérczi, László M. Fehér

We study the symmetric polynomial $\prod_{α\in A_{n,d}}\bigl(1+α_1 x_1+\cdots+α_n x_n\bigr)$ where $A_{n,d}:=\{α\in\mathbb{Z}_{\ge 0}^n:|α|=d\}$, which is the total Chern class of $\mathrm{Sym}^d(\mathbb{C}^n)$, viewed as a torus representation whose Chern roots are the weights $α_1 x_1+\cdots+α_n x_n$ for $α\in A_{n,d}$. Its homogeneous degree-$k$ part $c_k(n,d)$ is the $k$-th Chern class of $\mathrm{Sym}^d(\mathbb{C}^n)$. These Chern classes, together with their coefficients in various symmetric function bases, play a central role in enumerative geometry. Despite their simple definition, general closed formulas for their coefficients are subtle, and many structural properties of these classes have remained poorly understood. In this paper we prove several conjectures concerning their structure, establish explicit formulas, and study log-concavity properties for both the Chern classes and their $K$-theoretic analogue. In rank two, passing to the Schur basis and expanding the Schur coefficients in the binomial basis of $d$, we uncover a new binomial log-concavity phenomenon and prove refined positivity results. The paper demonstrates a novel methodology: we combine several AI systems with human mathematical insight in a coordinated workflow, deploying each tool according to its strengths in experimental discovery, conjecture formation, symbolic proof construction, and verification. To our knowledge, this is one of the first detailed case studies of orchestrating multiple AI tools to make substantial progress on a coherent mathematical research project.

89.8AIMay 21
Advancing Mathematics Research with AI-Driven Formal Proof Search

George Tsoukalas, Anton Kovsharov, Sergey Shirobokov et al.

Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics research. A mitigation is using LLMs to generate formal proofs in languages like Lean. We perform the first large-scale evaluation of this method's ability to solve open problems. Our most capable agent autonomously resolved 9 of 353 open Erdős problems at the per-problem cost of a few hundred dollars, proved 44/492 OEIS conjectures, and is being deployed in combinatorics, optimization, graph theory, algebraic geometry, and quantum optics research. A basic agent alternating LLM-based generation with Lean-based verification replicated the Erdős successes but proved costlier on the hardest problems. These findings demonstrate the power of AI-aided formal proof search and shed light on the agent designs that enable it.

LGJul 1, 2023
An ML approach to resolution of singularities

Gergely Bérczi, Honglu Fan, Mingcong Zeng

The solution set of a system of polynomial equations typically contains ill-behaved, singular points. Resolution is a fundamental process in geometry in which we replace singular points with smooth points, while keeping the rest of the solution set unchanged. Resolutions are not unique: the usual way to describe them involves repeatedly performing a fundamental operation known as "blowing-up", and the complexity of the resolution highly depends on certain choices. The process can be translated into various versions of a 2-player game, the so-called Hironaka game, and a winning strategy for the first player provides a solution to the resolution problem. In this paper we introduce a new approach to the Hironaka game that uses reinforcement learning agents to find optimal resolutions of singularities. In certain domains, the trained model outperforms state-of-the-art selection heuristics in total number of polynomial additions performed, which provides a proof-of-concept that recent developments in machine learning have the potential to improve performance of algorithms in symbolic computation.

AGFeb 6
Evolving Ranking Functions for Canonical Blow-Ups in Positive Characteristic

Gergely Bérczi

Resolution of singularities in positive characteristic remains a long-standing open problem in algebraic geometry. In characteristic zero, the problem was solved by Hironaka in 1964, work for which he was awarded the Fields Medal. Modern proofs proceed by constructing suitable ranking functions, that is, invariants shown to strictly decrease along canonical sequences of blow-ups, ensuring termination. In positive characteristic, however, no such general ranking function is known: Frobenius-specific pathologies, such as the kangaroo phenomenon, can cause classical characteristic-zero invariants to plateau or even temporarily increase, presenting a fundamental obstruction to existing approaches. In this paper we report a sequence of experiments using the evolutionary search model AlphaEvolve, designed to discover candidate ranking functions for a toy canonical blow-up process. Our test benchmarks consist of carefully selected hypersurface singularities in dimension $4$ and characteristic $p=3$, with monic purely inseparable leading term, a regime in which naive order-based invariants often fail. After iteratively refining the experimental design, we obtained a discretized five-component lexicographic ranking function satisfying a bounded-delay descent criterion with zero violations across the benchmark. These experiments in turn motivated our main results: the conjectural delayed ranking functions in characteristic $3$ formulated in two conjectures.

CLSep 30, 2025
IMProofBench: Benchmarking AI on Research-Level Mathematical Proof Generation

Johannes Schmitt, Gergely Bérczi, Jasper Dekoninck et al.

As the mathematical capabilities of large language models (LLMs) improve, it becomes increasingly important to evaluate their performance on research-level tasks at the frontier of mathematical knowledge. However, existing benchmarks are limited, as they focus solely on final-answer questions or high-school competition problems. To address this gap, we introduce IMProofBench, a private benchmark consisting of 39 peer-reviewed problems developed by expert mathematicians. Each problem requires a detailed proof and is paired with subproblems that have final answers, supporting both an evaluation of mathematical reasoning capabilities by human experts and a large-scale quantitative analysis through automated grading. Furthermore, unlike prior benchmarks, the evaluation setup simulates a realistic research environment: models operate in an agentic framework with tools like web search for literature review and mathematical software such as SageMath. Our results show that current LLMs can succeed at the more accessible research-level questions, but still encounter significant difficulties on more challenging problems. Quantitatively, Grok-4 achieves the highest accuracy of 52% on final-answer subproblems, while GPT-5 obtains the best performance for proof generation, achieving a fully correct solution for 22% of problems. IMProofBench will continue to evolve as a dynamic benchmark in collaboration with the mathematical community, ensuring its relevance for evaluating the next generation of LLMs.

LGNov 29, 2024
A Note on Small Percolating Sets on Hypercubes via Generative AI

Gergely Bérczi, Adam Zsolt Wagner

We apply a generative AI pattern-recognition technique called PatternBoost to study bootstrap percolation on hypercubes. With this, we slightly improve the best existing upper bound for the size of percolating subsets of the hypercube.

COJan 25
Flow-based Extremal Mathematical Structure Discovery

Gergely Bérczi, Baran Hashemi, Jonas Klüver

The discovery of extremal structures in mathematics requires navigating vast and nonconvex landscapes where analytical methods offer little guidance and brute-force search becomes intractable. We introduce FlowBoost, a closed-loop generative framework that learns to discover rare and extremal geometric structures by combining three components: (i) a geometry-aware conditional flow-matching model that learns to sample high-quality configurations, (ii) reward-guided policy optimization with action exploration that directly optimizes the generation process toward the objective while maintaining diversity, and (iii) stochastic local search for both training-data generation and final refinement. Unlike prior open-loop approaches, such as PatternBoost that retrains on filtered discrete samples, or AlphaEvolve which relies on frozen Large Language Models (LLMs) as evolutionary mutation operators, FlowBoost enforces geometric feasibility during sampling, and propagates reward signal directly into the generative model, closing the optimization loop and requiring much smaller training sets and shorter training times, and reducing the required outer-loop iterations by orders of magnitude, while eliminating dependence on LLMs. We demonstrate the framework on four geometric optimization problems: sphere packing in hypercubes, circle packing maximizing sum of radii, the Heilbronn triangle problem, and star discrepancy minimization. In several cases, FlowBoost discovers configurations that match or exceed the best known results. For circle packings, we improve the best known lower bounds, surpassing the LLM-based system AlphaEvolve while using substantially fewer computational resources.

COOct 24, 2024
Reinforcement Learning the Chromatic Symmetric Function

Gergely Bérczi, Jonas Klüver

We propose a conjectural counting formula for the coefficients of the chromatic symmetric function of unit interval graphs using reinforcement learning. The formula counts specific disjoint cycle-tuples in the graphs, referred to as Eschers, which satisfy certain concatenation conditions. These conditions are identified by a reinforcement learning model and are independent of the particular unit interval graph, resulting a universal counting expression.