NTSep 19, 2022
Machine Learning Class Numbers of Real Quadratic FieldsMalik Amir, Yang-Hui He, Kyu-Hwan Lee et al.
We implement and interpret various supervised learning experiments involving real quadratic fields with class numbers 1, 2 and 3. We quantify the relative difficulties in separating class numbers of matching/different parity from a data-scientific perspective, apply the methodology of feature analysis and principal component analysis, and use symbolic classification to develop machine-learned formulas for class numbers 1, 2 and 3 that apply to our dataset.
91.0NTMar 10
Murmurations: a case study in AI-assisted mathematicsYang-Hui He, Kyu-Hwan Lee, Thomas Oliver et al.
We report the emergence of a striking new phenomenon in arithmetic, which we call murmurations. First observed experimentally through averages over large arithmetic datasets, murmurations can be detected and analyzed using standard interpretability tools from machine learning, including principal component weightings, saliency curves, and convolutional filters. Although discovered computationally, they constitute a genuinely new and intriguing phenomenon in arithmetic that can be formulated and investigated using established tools of number theory. In particular, murmurations encode subtle information about Frobenius traces and naturally belong to the framework of arithmetic statistics. More precisely, murmurations connect to central themes surrounding the conjecture of Birch and Swinnerton-Dyer and perspectives from random matrix theory. In this paper, we present an overview of murmurations, contextualizing them within number theory and AI.
NTJan 3, 2025
Learning Fricke signs from Maass form CoefficientsJoanna Bieri, Giorgi Butbaia, Edgar Costa et al.
In this paper, we conduct a data-scientific investigation of Maass forms. We find that averaging the Fourier coefficients of Maass forms with the same Fricke sign reveals patterns analogous to the recently discovered "murmuration" phenomenon, and that these patterns become more pronounced when parity is incorporated as an additional feature. Approximately 43% of the forms in our dataset have an unknown Fricke sign. For the remaining forms, we employ Linear Discriminant Analysis (LDA) to machine learn their Fricke sign, achieving 96% (resp. 94%) accuracy for forms with even (resp. odd) parity. We apply the trained LDA model to forms with unknown Fricke signs to make predictions. The average values based on the predicted Fricke signs are computed and compared to those for forms with known signs to verify the reasonableness of the predictions. Additionally, a subset of these predictions is evaluated against heuristic guesses provided by Hejhal's algorithm, showing a match approximately 95% of the time. We also use neural networks to obtain results comparable to those from the LDA model.
NTDec 7, 2020
Machine-Learning Arithmetic CurvesYang-Hui He, Kyu-Hwan Lee, Thomas Oliver
We show that standard machine-learning algorithms may be trained to predict certain invariants of low genus arithmetic curves. Using datasets of size around one hundred thousand, we demonstrate the utility of machine-learning in classification problems pertaining to the BSD invariants of an elliptic curve (including its rank and torsion subgroup), and the analogous invariants of a genus 2 curve. Our results show that a trained machine can efficiently classify curves according to these invariants with high accuracies (>0.97). For problems such as distinguishing between torsion orders, and the recognition of integral points, the accuracies can reach 0.998.
NTNov 17, 2020
Machine-Learning Number FieldsYang-Hui He, Kyu-Hwan Lee, Thomas Oliver
We show that standard machine-learning algorithms may be trained to predict certain invariants of algebraic number fields to high accuracy. A random-forest classifier that is trained on finitely many Dedekind zeta coefficients is able to distinguish between real quadratic fields with class number 1 and 2, to 0.96 precision. Furthermore, the classifier is able to extrapolate to fields with discriminant outside the range of the training data. When trained on the coefficients of defining polynomials for Galois extensions of degrees 2, 6, and 8, a logistic regression classifier can distinguish between Galois groups and predict the ranks of unit groups with precision >0.97.
NTOct 2, 2020
Machine-Learning the Sato--Tate ConjectureYang-Hui He, Kyu-Hwan Lee, Thomas Oliver
We apply some of the latest techniques from machine-learning to the arithmetic of hyperelliptic curves. More precisely we show that, with impressive accuracy and confidence (between 99 and 100 percent precision), and in very short time (matter of seconds on an ordinary laptop), a Bayesian classifier can distinguish between Sato-Tate groups given a small number of Euler factors for the L-function. Our observations are in keeping with the Sato-Tate conjecture for curves of low genus. For elliptic curves, this amounts to distinguishing generic curves (with Sato-Tate group SU(2)) from those with complex multiplication. In genus 2, a principal component analysis is observed to separate the generic Sato-Tate group USp(4) from the non-generic groups. Furthermore in this case, for which there are many more non-generic possibilities than in the case of elliptic curves, we demonstrate an accurate characterisation of several Sato-Tate groups with the same identity component. Throughout, our observations are verified using known results from the literature and the data available in the LMFDB. The results in this paper suggest that a machine can be trained to learn the Sato-Tate distributions and may be able to classify curves much more efficiently than the methods available in the literature.