75.0 LG May 28
Discovering a Zeta Map Algorithm on Dyck Paths via Mechanistic Interpretability
Xiaoyu Huang, Blake Jackson, Kyu-Hwan Lee
Machine learning is increasingly used in mathematical discovery, but in mathematics the desired output is often not a prediction itself, but an explicit construction that can be checked independently. We study this setting through the zeta map on Dyck paths, a classical bijection in the combinatorics of the q,t-Catalan numbers. We train a deliberately small one-layer, one-head encoder-decoder transformer on this map and analyze its learned computation using mechanistic interpretability tools, including decoder cross-attention analysis, linear probing, and causal intervention. The analysis reveals a level-based mechanism: encoder representations make path levels linearly accessible, while the decoder selects and traverses input positions in a structured way. Translating these signals into combinatorics leads to the scaffolding map, an explicit peak-centered traversal algorithm for Dyck paths. We prove that this algorithm agrees with the zeta map, modulo a reversal convention in the labeling. This gives a controlled example of AI-assisted mathematical discovery in which mechanistic interpretability turns model behavior into a precise, human-verifiable combinatorial algorithm.
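As a point of reference for the "level-based" language above, here is a minimal sketch of the classical zeta map computed directly from path levels (the area sequence). It assumes one common sweep-style convention: for each level i, read the levels left to right, emitting an east step for every entry equal to i-1 and a north step for every entry equal to i. Reading direction and step conventions differ between references, and this is not the paper's scaffolding map. The function names `is_dyck`, `area_sequence`, and `zeta` are hypothetical.

```python
from itertools import product


def is_dyck(path):
    """Return True if `path` (a string of 'N'/'E' steps) is a Dyck path."""
    height = 0
    for step in path:
        height += 1 if step == "N" else -1
        if height < 0:  # path dipped below the diagonal
            return False
    return height == 0


def area_sequence(path):
    """Level of each north step: (# N steps before it) - (# E steps before it)."""
    levels, height = [], 0
    for step in path:
        if step == "N":
            levels.append(height)
            height += 1
        else:
            height -= 1
    return levels


def zeta(path):
    """Sweep-style zeta map under the convention stated above (an assumption).

    Expects a non-empty Dyck path.
    """
    levels = area_sequence(path)
    out = []
    for i in range(max(levels) + 2):
        for a in levels:
            if a == i - 1:
                out.append("E")
            elif a == i:
                out.append("N")
    return "".join(out)


# All Dyck paths with 3 north steps; under this convention the map permutes them.
paths_n3 = ["".join(p) for p in product("NE", repeat=6) if is_dyck("".join(p))]
images_n3 = [zeta(p) for p in paths_n3]
```

For example, `zeta("NNEE")` gives `"NENE"` and `zeta("NENE")` gives `"NNEE"` under this convention.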
NT Sep 19, 2022
Machine Learning Class Numbers of Real Quadratic Fields
Malik Amir, Yang-Hui He, Kyu-Hwan Lee et al.
We implement and interpret various supervised learning experiments involving real quadratic fields with class numbers 1, 2 and 3. We quantify the relative difficulties in separating class numbers of matching/different parity from a data-scientific perspective, apply the methodology of feature analysis and principal component analysis, and use symbolic classification to develop machine-learned formulas for class numbers 1, 2 and 3 that apply to our dataset.
70.3 LO May 20
Lean-GAP: A Dataset of Formalized Graduate Algebra Problems
Seewoo Lee, Byung-Hak Hwang, Hyojae Lim et al.
We present Lean-GAP (Lean-Graduate Algebra Problems), 430 formalized graduate-level algebra problems from the textbook Abstract Algebra by Dummit and Foote. We develop a scalable pipeline consisting of PDF-to-LaTeX preprocessing, autoformalization into Lean 4, and verification of informal-formal correspondence. While the preprocessing and autoformalization stages can be largely automated, we find that verification remains the most subtle and labor-intensive component, requiring careful human oversight. Our contributions include (i) the construction of a structured dataset of formalized exercises, (ii) a systematic methodology for formalizing textbook mathematics, and (iii) an analysis of recurring challenges in the formalization process. We also compare the performance of different autoformalization models and highlight key bottlenecks in translating informal statements into formal language.
22.2 NT Mar 10
Murmurations: a case study in AI-assisted mathematics
Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver et al.
We report the emergence of a striking new phenomenon in arithmetic, which we call murmurations. First observed experimentally through averages over large arithmetic datasets, murmurations can be detected and analyzed using standard interpretability tools from machine learning, including principal component weightings, saliency curves, and convolutional filters. Although discovered computationally, they constitute a genuinely new and intriguing phenomenon in arithmetic that can be formulated and investigated using established tools of number theory. In particular, murmurations encode subtle information about Frobenius traces and naturally belong to the framework of arithmetic statistics. More precisely, murmurations connect to central themes surrounding the conjecture of Birch and Swinnerton-Dyer and perspectives from random matrix theory. In this paper, we present an overview of murmurations, contextualizing them within number theory and AI.
NT Feb 14, 2025
Learning Euler Factors of Elliptic Curves
Angelica Babei, François Charton, Edgar Costa et al.
We apply transformer models and feedforward neural networks to predict Frobenius traces $a_p$ from elliptic curves given other traces $a_q$. We train further models to predict $a_p \bmod 2$ from $a_q \bmod 2$, and cross-analysis such as $a_p \bmod 2$ from $a_q$. Our experiments reveal that these models achieve high accuracy, even in the absence of explicit number-theoretic tools like functional equations of $L$-functions. We also present partial interpretability findings.
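For readers unfamiliar with the quantity being predicted: the Frobenius trace of an elliptic curve $E$ at a prime $p$ of good reduction is $a_p = p + 1 - \#E(\mathbb{F}_p)$. The sketch below computes it by naive point counting. It assumes the curve is given in short Weierstrass form $y^2 = x^3 + ax + b$; the function name `frobenius_trace` is hypothetical, and this brute-force count is only an illustration, not how the paper's datasets were produced.

```python
def frobenius_trace(a, b, p):
    """Return a_p = p + 1 - #E(F_p) for E: y^2 = x^3 + a*x + b.

    Assumes p is a prime of good reduction. Runs in O(p) time,
    so it is only suitable for small primes.
    """
    # sqrt_count[r] = number of y in F_p with y^2 = r
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1

    # Affine points: for each x, count the y solving y^2 = x^3 + a*x + b.
    affine = sum(sqrt_count[(x**3 + a * x + b) % p] for x in range(p))

    # Add 1 for the point at infinity.
    return p + 1 - (affine + 1)


# Example: E: y^2 = x^3 - x at a few small odd primes.
traces = {p: frobenius_trace(-1, 0, p) for p in (3, 5, 7)}
```

For this curve the count gives $a_3 = 0$, $a_5 = -2$, and $a_7 = 0$.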
CO Nov 6, 2024
Machine Learning Mutation-Acyclicity of Quivers
Kymani T. K. Armstrong-Williams, Edward Hirst, Blake Jackson et al.
Machine learning (ML) has emerged as a powerful tool in mathematical research in recent years. This paper applies ML techniques to the study of quivers -- a type of directed multigraph with significant relevance in algebra, combinatorics, computer science, and mathematical physics. Specifically, we focus on the challenging problem of determining the mutation-acyclicity of a quiver on 4 vertices, a property that is pivotal since mutation-acyclicity is often a necessary condition for theorems involving path algebras and cluster algebras. Although this classification is known for quivers with at most 3 vertices, little is known about quivers on more than 3 vertices. We give a computer-assisted proof that mutation-acyclicity is decidable for quivers on 4 vertices with edge weight at most 2. By leveraging neural networks (NNs) and support vector machines (SVMs), we then accurately classify more general 4-vertex quivers as mutation-acyclic or non-mutation-acyclic. Our results demonstrate that ML models can efficiently detect mutation-acyclicity, providing a promising computational approach to this combinatorial problem; the trained SVM equation provides a starting point to guide future theoretical development.
HO Feb 12, 2025
Mathematical Data Science
Michael R. Douglas, Kyu-Hwan Lee
Can machine learning help discover new mathematical structures? In this article we discuss an approach to doing this which one can call "mathematical data science". In this paradigm, one studies mathematical objects collectively rather than individually, by creating datasets and doing machine learning experiments and interpretations. After an overview, we present two case studies: murmurations in number theory and loadings of partitions related to Kronecker coefficients in representation theory and combinatorics.
CO Nov 16, 2025
From Black Box to Bijection: Interpreting Machine Learning to Build a Zeta Map Algorithm
Xiaoyu Huang, Blake Jackson, Kyu-Hwan Lee
There is a large class of problems in algebraic combinatorics which can be distilled into the same challenge: construct an explicit combinatorial bijection. Traditionally, researchers have solved challenges like these by visually inspecting the data for patterns, formulating conjectures, and then proving them. But what is to be done if patterns fail to emerge until the data grows beyond human scale? In this paper, we propose a new workflow for discovering combinatorial bijections via machine learning. As a proof of concept, we train a transformer on paired Dyck paths and use its learned attention patterns to derive a new algorithmic description of the zeta map, which we call the \textit{Scaffolding Map}.
NT Aug 8, 2025
Machines Learn Number Fields, But How? The Case of Galois Groups
Kyu-Hwan Lee, Seewoo Lee
By applying interpretable machine learning methods such as decision trees, we study how simple models can classify the Galois groups of Galois extensions over $\mathbb{Q}$ of degrees 4, 6, 8, 9, and 10, using Dedekind zeta coefficients. Our interpretation of the machine learning results allows us to understand how the distribution of zeta coefficients depends on the Galois group, and to prove new criteria for classifying the Galois groups of these extensions. Combined with previous results, this work provides another example of a new paradigm in mathematical research driven by machine learning.
LG Feb 17, 2025
Interpretable Machine Learning for Kronecker Coefficients
Giorgi Butbaia, Kyu-Hwan Lee, Fabian Ruehle
We analyze the saliency of neural networks and employ interpretable machine learning models to predict whether the Kronecker coefficients of the symmetric group are zero or not. Our models use triples of partitions as input features, as well as b-loadings derived from the principal component of an embedding that captures the differences between partitions. Across all approaches, we achieve an accuracy of approximately 83% and derive explicit formulas for a decision function in terms of b-loadings. Additionally, we develop transformer-based models for prediction, achieving the highest reported accuracy of over 99%.
NT Jan 3, 2025
Learning Fricke signs from Maass form Coefficients
Joanna Bieri, Giorgi Butbaia, Edgar Costa et al.
In this paper, we conduct a data-scientific investigation of Maass forms. We find that averaging the Fourier coefficients of Maass forms with the same Fricke sign reveals patterns analogous to the recently discovered "murmuration" phenomenon, and that these patterns become more pronounced when parity is incorporated as an additional feature. Approximately 43% of the forms in our dataset have an unknown Fricke sign. For the remaining forms, we employ Linear Discriminant Analysis (LDA) to machine learn their Fricke sign, achieving 96% (resp. 94%) accuracy for forms with even (resp. odd) parity. We apply the trained LDA model to forms with unknown Fricke signs to make predictions. The average values based on the predicted Fricke signs are computed and compared to those for forms with known signs to verify the reasonableness of the predictions. Additionally, a subset of these predictions is evaluated against heuristic guesses provided by Hejhal's algorithm, showing a match approximately 95% of the time. We also use neural networks to obtain results comparable to those from the LDA model.
NT Dec 7, 2020
Machine-Learning Arithmetic Curves
Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver
We show that standard machine-learning algorithms may be trained to predict certain invariants of low genus arithmetic curves. Using datasets of size around one hundred thousand, we demonstrate the utility of machine-learning in classification problems pertaining to the BSD invariants of an elliptic curve (including its rank and torsion subgroup), and the analogous invariants of a genus 2 curve. Our results show that a trained machine can efficiently classify curves according to these invariants with high accuracies (>0.97). For problems such as distinguishing between torsion orders, and the recognition of integral points, the accuracies can reach 0.998.
NT Nov 17, 2020
Machine-Learning Number Fields
Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver
We show that standard machine-learning algorithms may be trained to predict certain invariants of algebraic number fields to high accuracy. A random-forest classifier that is trained on finitely many Dedekind zeta coefficients is able to distinguish between real quadratic fields with class number 1 and 2, to 0.96 precision. Furthermore, the classifier is able to extrapolate to fields with discriminant outside the range of the training data. When trained on the coefficients of defining polynomials for Galois extensions of degrees 2, 6, and 8, a logistic regression classifier can distinguish between Galois groups and predict the ranks of unit groups with precision >0.97.
NT Oct 2, 2020
Machine-Learning the Sato-Tate Conjecture
Yang-Hui He, Kyu-Hwan Lee, Thomas Oliver
We apply some of the latest techniques from machine-learning to the arithmetic of hyperelliptic curves. More precisely, we show that, with impressive accuracy and confidence (between 99 and 100 percent precision), and in a very short time (a matter of seconds on an ordinary laptop), a Bayesian classifier can distinguish between Sato-Tate groups given a small number of Euler factors for the L-function. Our observations are in keeping with the Sato-Tate conjecture for curves of low genus. For elliptic curves, this amounts to distinguishing generic curves (with Sato-Tate group SU(2)) from those with complex multiplication. In genus 2, a principal component analysis is observed to separate the generic Sato-Tate group USp(4) from the non-generic groups. Furthermore, in this case, for which there are many more non-generic possibilities than in the case of elliptic curves, we demonstrate an accurate characterisation of several Sato-Tate groups with the same identity component. Throughout, our observations are verified using known results from the literature and the data available in the LMFDB. The results in this paper suggest that a machine can be trained to learn the Sato-Tate distributions and may be able to classify curves much more efficiently than the methods available in the literature.