Zoltán Kovács

AI
h-index: 4
14 papers
43 citations
Novelty: 17%
AI Score: 33

14 Papers

97.8 CL May 9
Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Guijin Son, Seungone Kim, Catherine Arnett et al.

Following the recent achievement of gold-medal performance on the IMO by frontier LLMs, the community is searching for the next meaningful and challenging target for measuring LLM reasoning. Whereas olympiad-style problems measure step-by-step reasoning alone, research-level problems use such reasoning to advance the frontier of mathematical knowledge itself, emerging as a compelling alternative. Yet research-level math benchmarks remain scarce because such problems are difficult to source (e.g., Riemann Bench and FrontierMath-Tier 4 contain 25 and 50 problems, respectively). To support reliable evaluation of next-generation frontier models, we introduce Soohak, a 439-problem benchmark newly authored from scratch by 64 mathematicians. Soohak comprises two subsets. On the Challenge subset, frontier models including Gemini-3-Pro, GPT-5, and Claude-Opus-4.5 reach 30.4%, 26.4%, and 10.4% respectively, leaving substantial headroom, while leading open-weight models such as Qwen3-235B, GPT-OSS-120B, and Kimi-2.5 remain below 15%. Notably, beyond standard problem solving, Soohak introduces a refusal subset that probes a capability intrinsic to research mathematics: recognizing ill-posed problems and pausing rather than producing confident but unjustified answers. On this subset, no model exceeds 50%, identifying refusal as a new optimization target that current models do not directly address. To prevent contamination, the dataset will be publicly released in late 2026, with model evaluations available upon request in the interim.

SC Jan 22, 2024
Solving with GeoGebra Discovery an Austrian Mathematics Olympiad problem: Lessons Learned

Belén Ariño-Morera, Zoltán Kovács, Tomás Recio et al.

We address, through the automated reasoning tools in GeoGebra Discovery, a problem from a regional phase of the Austrian Mathematics Olympiad 2023. Trying to solve this problem gives rise to four different kinds of feedback: the almost instantaneous, automated solution of the proposed problem; the measure of its complexity, according to some recent proposals; the automated discovery of a generalization of the given assertion, showing that the same statement is true over more general polygons than those mentioned in the problem; and the difficulties associated with the analysis of the surprisingly high number of intricate degenerate cases that appear when using the LocusEquation command in this problem. In our communication we will describe and reflect on these diverse issues, highlighting the problem's exemplary role in showing some of the advantages, problems, and current fields of development of GeoGebra Discovery.

SC Jan 22, 2024
Showing Proofs, Assessing Difficulty with GeoGebra Discovery

Zoltán Kovács, Tomás Recio, M. Pilar Vélez

In our contribution we describe some ongoing improvements concerning the Automated Reasoning Tools developed in GeoGebra Discovery, providing different examples of the performance of these new features. We describe the new ShowProof command, which outputs both the sequence of steps performed by GeoGebra Discovery to confirm a certain statement and a number intended to grade the difficulty or interest of the assertion. The proposal of this assessment measure, involving the comparison of the expression of the thesis (or conclusion) as a combination of the hypotheses, will be developed.

CG Nov 18, 2025
Automated proving in planar geometry based on the complex number identity method and elimination

Zoltán Kovács, Xicheng Peng

We extend the complex number identity proving method into a fully automated procedure, based on elimination ideals. By using declarative equations or rewriting each real-relational hypothesis $h_i$ to $h_i-r_i$, and the thesis $t$ to $t-r$, clearing the denominators and introducing an extra expression with a slack variable, we eliminate all free and relational point variables. From the obtained ideal $I$ in $\mathbb{Q}[r,r_1,r_2,\ldots]$ we can find a conclusive result. A key observation is that if $r_1,r_2,\ldots$ are real, then $r$ must also be real whenever there is a linear polynomial $p(r)\in I$, unless division by zero occurs when expressing $r$. Our results are presented in Mathematica, Maple and in a new version of the Giac computer algebra system. Finally, we present a prototype of the automated procedure in an experimental version of the dynamic geometry software GeoGebra.
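
The elimination step described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses SymPy in place of Mathematica, Maple or Giac, and the toy configuration is an assumption chosen for illustration. Points $B$, $C$, $D$ have complex coordinates `b`, `c`, `d`, with $A$ placed at the origin. The hypotheses say that $b/c$ and $c/d$ are real, and the thesis says that $b/d$ is real (collinearity through $A$). The variable names and the slack polynomial `s*c*d - 1` are likewise illustrative choices.

```python
import sympy as sp

# Complex coordinates of B, C, D (A is placed at the origin, an assumption
# made to keep the example small), a slack variable s, and the relational
# variables r, r1, r2 standing for the thesis and the two hypotheses.
b, c, d, s, r, r1, r2 = sp.symbols("b c d s r r1 r2")

hypotheses = [
    b - r1 * c,  # h1 - r1 with denominators cleared: b/c = r1
    c - r2 * d,  # h2 - r2 with denominators cleared: c/d = r2
]
thesis = b - r * d             # t - r with denominators cleared: b/d = r
nondegeneracy = s * c * d - 1  # slack variable forces c*d != 0

# A lexicographic Groebner basis with the point and slack variables ordered
# first eliminates them: the basis elements that involve only r, r1, r2
# generate the elimination ideal.
G = sp.groebner(
    hypotheses + [thesis, nondegeneracy],
    s, b, c, d, r, r1, r2,
    order="lex",
)
relational = {r, r1, r2}
elimination_ideal = [g for g in G.exprs if g.free_symbols <= relational]

# A polynomial that is linear in r expresses r rationally in r1 and r2,
# so r is real whenever r1 and r2 are real.
linear_in_r = [g for g in elimination_ideal if sp.degree(g, r) == 1]
print(linear_in_r)
```

For this configuration the relation $r = r_1 r_2$ follows directly from the hypotheses, so the elimination ideal contains $r - r_1 r_2$, which is linear in $r$.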

CY Jan 22, 2024
Using Java Geometry Expert as Guide in the Preparations for Math Contests

Ines Ganglmayr, Zoltán Kovács

We give an insight into Java Geometry Expert (JGEX) in use in a school context, focusing on the Austrian school system. JGEX can offer great support in some classroom situations, especially for solving mathematical competition tasks. Also, we discuss some limitations of the program.

HO Jan 22, 2024
Solving Some Geometry Problems of the Náboj 2023 Contest with Automated Deduction in GeoGebra Discovery

Amela Hota, Zoltán Kovács, Alexander Vujic

In this article, we solve some of the geometry problems of the Náboj 2023 competition with the help of a computer, using examples that the software tool GeoGebra Discovery can calculate. In each case, the calculation requires symbolic computations. We analyze the difficulty of feeding the problem into the machine and set further goals to make the problems of this type of contest even more tractable in the future.

LO Jan 19, 2024
Proceedings 14th International Conference on Automated Deduction in Geometry

Pedro Quaresma, Zoltán Kovács

ADG is a forum to exchange ideas and views, to present research results and progress, and to demonstrate software tools at the intersection between geometry and automated deduction. The conference is held every two years. The previous editions of ADG were held in Hagenberg in 2021 (online, postponed from 2020 due to COVID-19), Nanning in 2018, Strasbourg in 2016, Coimbra in 2014, Edinburgh in 2012, Munich in 2010, Shanghai in 2008, Pontevedra in 2006, Gainesville in 2004, Hagenberg in 2002, Zurich in 2000, Beijing in 1998, and Toulouse in 1996. The 14th edition, ADG 2023, was held in Belgrade, Serbia, on September 20-22, 2023. This edition of ADG had an additional special focus topic, Deduction in Education. Invited Speakers: Julien Narboux, University of Strasbourg, France, "Formalisation, arithmetization and automatisation of geometry"; Filip Marić, University of Belgrade, Serbia, "Automatization, formalization and visualization of hyperbolic geometry"; Zlatan Magajna, University of Ljubljana, Slovenia, "Workshop OK Geometry".

HC Jan 3, 2022
GeoGebra Discovery in Context

Zoltán Kovács, Tomás Recio, M. Pilar Vélez

In our contribution we will reflect, through a collection of selected examples, on the potential impact of the GeoGebra Discovery application on different social and educational contexts.

AI Dec 28, 2021
Proceedings of the 13th International Conference on Automated Deduction in Geometry

Predrag Janičić, Zoltán Kovács

Automated Deduction in Geometry (ADG) is a forum to exchange ideas and views, to present research results and progress, and to demonstrate software tools at the intersection between geometry and automated deduction. Relevant topics include (but are not limited to): polynomial algebra, invariant and coordinate-free methods; probabilistic, synthetic, and logic approaches, techniques for automated geometric reasoning from discrete mathematics, combinatorics, and numerics; interactive theorem proving in geometry; symbolic and numeric methods for geometric computation, geometric constraint solving, automated generation/reasoning and manipulation with diagrams; design and implementation of geometry software, automated theorem provers, special-purpose tools, experimental studies; applications of ADG in mechanics, geometric modelling, CAGD/CAD, computer vision, robotics and education. Traditionally, the ADG conference is held every two years. The previous editions of ADG were held in Nanning in 2018, Strasbourg in 2016, Coimbra in 2014, Edinburgh in 2012, Munich in 2010, Shanghai in 2008, Pontevedra in 2006, Gainesville in 2004, Hagenberg in 2002, Zurich in 2000, Beijing in 1998, and Toulouse in 1996. The 13th edition of ADG was supposed to be held in 2020 in Hagenberg, Austria, but due to the COVID-19 pandemic it was postponed to 2021 and held online (still hosted by RISC Institute, Hagenberg, Austria), September 15-17, 2021 (https://www.risc.jku.at/conferences/adg2021).

AI Jul 24, 2020
Towards Automated Discovery of Geometrical Theorems in GeoGebra

Zoltán Kovács, Jonathan H. Yu

We describe a prototype of a new experimental GeoGebra command and tool Discover that analyzes geometric figures for salient patterns, properties, and theorems. This tool is a basic implementation of automated discovery in elementary planar geometry. The paper focuses on the mathematical background of the implementation, as well as methods to avoid combinatorial explosion when storing the interesting properties of a geometric figure.

AI Feb 28, 2020
Towards a Geometry Automated Provers Competition

Nuno Baeta, Pedro Quaresma, Zoltán Kovács

The geometry automated theorem proving area distinguishes itself by a large number of specific methods and implementations, different approaches (synthetic, algebraic, semi-synthetic) and different goals and applications (from research in the area of artificial intelligence to applications in education). Apart from the usual measures of efficiency (e.g. CPU time), the possibility of visual and/or readable proofs is also an expected output against which the geometry automated theorem provers (GATP) should be measured. A competition between GATP would make it possible to create a test bench that helps GATP developers improve existing provers and propose new ones. It would also make it possible to establish a ranking of GATP that could be used by "clients" (e.g. developers of educational e-learning systems) to choose the best implementation for a given intended use.

AI Feb 16, 2018
Detecting truth, just on parts

Zoltán Kovács, Tomás Recio, M. Pilar Vélez

We introduce and discuss, through a computational algebraic geometry approach, how automated reasoning can handle propositions that are simultaneously true and false over some relevant collections of instances. A rigorous, algorithmic criterion is presented for detecting such cases, and its performance is exemplified through the implementation of this test in the dynamic geometry program GeoGebra.

HO Apr 27, 2017
No, This is not a Circle

Zoltán Kovács

A popular curve shown in introductory maths textbooks seems like a circle, but it is actually a different curve. This paper discusses some elementary approaches to identify the geometric object, including novel technological means by using GeoGebra. We demonstrate two ways to refute the false impression, two suggestions to find a correct conjecture, and four ways to confirm the result by proving it rigorously. All of the discussed approaches can be introduced in classrooms at various levels from middle school to high school.

AI Mar 3, 2016
GeoGebra Tools with Proof Capabilities

Zoltán Kovács, Csilla Sólyom-Gecse

We report on significant enhancements of the complex algebraic geometry theorem proving subsystem in GeoGebra for automated proofs in Euclidean geometry, concerning the extension of numerous GeoGebra tools with proof capabilities. As a result, a number of elementary theorems can be proven by using GeoGebra's intuitive user interface on various computer architectures including native Java and web-based systems with JavaScript. We also provide a test suite for benchmarking our results with 200 test cases.