AI · Jul 26, 2024
Artificial Neural Networks on Graded Vector Spaces
Tony Shaska
This paper presents a transformative framework for artificial neural networks over graded vector spaces, tailored to model hierarchical and structured data in fields like algebraic geometry and physics. By exploiting the algebraic properties of graded vector spaces, where features carry distinct weights, we extend classical neural networks with graded neurons, layers, and activation functions that preserve structural integrity. Grounded in group actions, representation theory, and graded algebra, our approach combines theoretical rigor with practical utility. We introduce graded neural architectures, loss functions prioritizing graded components, and equivariant extensions adaptable to diverse gradings. Case studies validate the framework's effectiveness, outperforming standard neural networks in tasks such as predicting invariants in weighted projective spaces and modeling supersymmetric systems. This work establishes a new frontier in machine learning, merging mathematical sophistication with interdisciplinary applications. Future challenges, including computational scalability and finite field extensions, offer rich opportunities for advancing this paradigm.
LG · Jan 22, 2025
Galois groups of polynomials and neurosymbolic networks
Elira Shaska, Tony Shaska
This paper introduces a novel approach to understanding Galois theory, one of the foundational areas of algebra, through the lens of machine learning. By analyzing polynomial equations with machine learning techniques, we aim to streamline the process of determining solvability by radicals and explore broader applications within Galois theory. This summary encapsulates the background, methodology, potential applications, and challenges of using data science in Galois theory. More specifically, we design a neurosymbolic network to classify Galois groups and show that it is more efficient than standard neural networks. We discover some very interesting distributions of polynomials for groups not isomorphic to the symmetric groups and alternating groups.
LG · Feb 25, 2025
Graded Neural Networks
Tony Shaska
This paper presents a novel framework for graded neural networks (GNNs) built over graded vector spaces $\mathcal{V}_{\mathbf{w}}^n$, extending classical neural architectures by incorporating algebraic grading. Leveraging a coordinate-wise grading structure with scalar action $\lambda \star \mathbf{x} = (\lambda^{q_i} x_i)$, defined by a tuple $\mathbf{w} = (q_0, \ldots, q_{n-1})$, we introduce graded neurons, layers, activation functions, and loss functions that adapt to feature significance. Theoretical properties of graded spaces are established, followed by a comprehensive GNN design, addressing computational challenges like numerical stability and gradient scaling. Potential applications span machine learning and photonic systems, exemplified by high-speed laser-based implementations. This work offers a foundational step toward graded computation, unifying mathematical rigor with practical potential, with avenues for future empirical and hardware exploration.
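The graded scalar action in the abstract above can be illustrated with a minimal sketch. It assumes real-valued coordinates and non-negative integer grades; the function name `graded_scale` and the sample weights are hypothetical and chosen only for illustration, not taken from the paper.

```python
def graded_scale(lam, x, w):
    """Apply the graded scalar action: coordinate i is multiplied by lam ** w[i].

    lam : scalar
    x   : list of coordinates (x_0, ..., x_{n-1})
    w   : tuple of grades (q_0, ..., q_{n-1}), assumed here to be integers
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return [lam ** q * xi for q, xi in zip(w, x)]


# Hypothetical grading (2, 3): the second coordinate scales faster than the first.
w = (2, 3)
print(graded_scale(2, [1, 2], w))  # [4, 16]

# The action is compatible with multiplication of scalars:
# (lam * mu) acting on x equals lam acting on (mu acting on x).
assert graded_scale(6, [1, 2], w) == graded_scale(2, graded_scale(3, [1, 2], w), w)
```

With all grades equal to 1 this reduces to ordinary scalar multiplication, which is one way to see the graded setting as an extension of the classical one.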
LG · Jul 27, 2025
Graded Transformers
Tony Shaska
We introduce the Graded Transformer framework, a new class of sequence models that embeds algebraic inductive biases through grading transformations on vector spaces. Extending Graded Neural Networks (GNNs), we propose two architectures: the Linearly Graded Transformer (LGT) and the Exponentially Graded Transformer (EGT). These models apply parameterized scaling operators, governed by fixed or learnable grading tuples and, in the case of EGT, exponential factors, to encode hierarchical structure in attention and representation layers and to improve efficiency for structured data. We establish rigorous guarantees, including universal approximation theorems for continuous and Sobolev functions, reduced sample complexity via effective VC dimension bounds, Lipschitz continuity of graded operations, and robustness to perturbations. A graded loss ensures gradient stability and alignment with domain priors during optimization. By treating grades as differentiable parameters, the framework enables adaptive feature prioritization, overcoming limitations of fixed grades in earlier models. The Graded Transformer provides a mathematically principled approach to hierarchical learning and neuro-symbolic reasoning. Applications include algebraic geometry (moduli spaces and zeta functions), physics (multiscale systems), natural language processing (syntactic parsing), biological sequence analysis (variant prediction), robotics and autonomous systems (safety-critical prioritization), the automotive industry (certifiable AI for ADAS), and blockchain and financial cryptography (secure coding and structured prediction).
LG · Feb 28, 2025
Neuro-Symbolic Learning for Galois Groups: Unveiling Probabilistic Trends in Polynomials
Elira Shaska, Tony Shaska
This paper presents a neurosymbolic approach to classifying Galois groups of polynomials, integrating classical Galois theory with machine learning to address challenges in algebraic computation. By combining neural networks with symbolic reasoning, we develop a model that outperforms purely numerical methods in accuracy and interpretability. Focusing on sextic polynomials with height $\leq 6$, we analyze a database of 53,972 irreducible examples, uncovering novel distributional trends, such as the 20 sextic polynomials with Galois group $C_6$ spanning just seven invariant-defined equivalence classes. These findings offer the first empirical insights into Galois group probabilities under height constraints and lay the groundwork for exploring solvability by radicals. Demonstrating AI's potential to reveal patterns beyond traditional symbolic techniques, this work paves the way for future research in computational algebra, with implications for probabilistic conjectures and higher degree classifications.
AI · Jan 26, 2025
A Neurosymbolic Framework for Geometric Reduction of Binary Forms
Ilias Kotsireas, Tony Shaska
This paper compares Julia reduction and hyperbolic reduction with the aim of finding equivalent binary forms with minimal coefficients. We demonstrate that hyperbolic reduction generally outperforms Julia reduction, particularly in the cases of sextics and decimics, though neither method guarantees achieving the minimal form. We further propose an additional shift and scaling to approximate the minimal form more closely. Finally, we introduce a machine learning framework to identify optimal transformations that minimize the heights of binary forms. This study provides new insights into the geometry and algebra of binary forms and highlights the potential of AI in advancing symbolic computation and reduction techniques. The findings, supported by extensive computational experiments, lay the groundwork for hybrid approaches that integrate traditional reduction methods with data-driven techniques.
LG · Nov 21, 2025
Internalizing Tools as Morphisms in Graded Transformers
Tony Shaska
We introduce a graded formulation of internal symbolic computation for transformers. The hidden space is endowed with a grading $V=\bigoplus_{g\in G}V_g$, and symbolic operations are realized as typed block maps (morphisms) $\varphi_{h\leftarrow g}:V_g\to V_h$ that are activated selectively by a differentiable routing policy. A self-supervised \emph{graded utility functional}, defined as the loss reduction induced by a candidate morphism, governs activation and yields sparse, interpretable behavior. We develop the algebraic and geometric foundations: an internal model category whose objects are homogeneous components and whose morphisms are admissible grade transitions; adjoint pairs encoding typed round trips; and information-geometric interpretations in terms of KL gain, mirror descent with Bregman divergences, and Fisher natural gradients. Methodologically, we specify a utility-aware routing mechanism and objective that remain fully end-to-end differentiable. Analytic case studies and lightweight sanity checks illustrate selective morphic activation on hybrid symbolic-linguistic tasks. The framework unifies symbolic computation, geometry, and self-supervised learning within the \emph{graded transformer} formalism \cite{sh-89,sh-95}, while subsuming prior external-tool paradigms (e.g., Toolformer \cite{toolformer2023}) as a special case via functorial internalization.
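The idea of a utility defined as the loss reduction induced by a candidate morphism can be sketched on a toy problem. Everything below is an assumption made for illustration: the scalar hidden state, the squared-error loss, the two candidate maps, and the sigmoid gate with a temperature parameter are hypothetical stand-ins, not the paper's routing policy or objective.

```python
import math


def utility(loss, h, morphism):
    """Loss reduction obtained by applying a candidate morphism to state h."""
    return loss(h) - loss(morphism(h))


def soft_gate(u, temperature=1.0):
    """Smooth gate in (0, 1): a sigmoid of the scaled utility (assumed form)."""
    return 1.0 / (1.0 + math.exp(-u / temperature))


# Toy setup (hypothetical): scalar state and squared error to a fixed target.
TARGET = 3.0


def loss(h):
    return (h - TARGET) ** 2


def helpful(h):
    return h + 1.0  # moves the state toward the target


def harmful(h):
    return h - 1.0  # moves the state away from the target


h = 1.0
u_helpful = utility(loss, h, helpful)  # 4.0 - 1.0 = 3.0
u_harmful = utility(loss, h, harmful)  # 4.0 - 9.0 = -5.0

# A positive utility opens the gate, a negative one closes it.
print(soft_gate(u_helpful) > 0.5, soft_gate(u_harmful) < 0.5)  # True True
```

In this sketch a candidate that lowers the loss receives a gate value above 0.5 and one that raises it falls below 0.5; lowering the temperature pushes the gates toward 0 or 1, which is one simple way to obtain sparse activation while keeping the gate differentiable.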