59.6 DM Apr 23
Irreducible Markov Chains on spaces of graphs with fixed degree-color sequences
Félix Almendra-Hernández, Jesús A. De Loera, Sonja Petrović
We study a colored generalization of the famous simple-switch Markov chain for sampling the set of graphs with a fixed degree sequence. Here we consider the space of graphs with colored vertices, in which we fix the degree sequence and another statistic arising from the vertex coloring, and prove that the set can be connected with simple color-preserving switches or moves. These moves form a basis for defining an irreducible Markov chain necessary for testing statistical model fit to block-partitioned network data. Our methods further generalize well-known algebraic results from the 1990s: namely, that the corresponding moves can be used to construct a regular triangulation for a generalization of the second hypersimplex. On the other hand, in contrast to the monochromatic case, we show that for \emph{simple} graphs, the 1-norm of the moves necessary to connect the space increases with the number of colors.
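As a point of reference, the classical (monochromatic) simple-switch step that this abstract generalizes can be sketched as follows. This is a minimal illustration, not the authors' colored chain: the names `switch_step` and `degrees` are hypothetical, and the color statistic is not specified in the abstract, so no color check is included. A color-preserving variant would additionally reject any proposal that changes that statistic.

```python
import random
from collections import Counter

def degrees(edges):
    """Degree of every vertex in a graph given as a set of (u, v) tuples."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def switch_step(edges, rng):
    """One step of the classical simple-switch chain on simple graphs.

    edges: set of tuples (u, v) with u < v.
    Picks two edges {a, b}, {c, d} and proposes {a, d}, {c, b}.
    The proposal is rejected (state unchanged) if it would create
    a loop or a repeated edge, so the graph stays simple.
    """
    e1, e2 = rng.sample(sorted(edges), 2)
    a, b = e1
    # Random orientation of the second edge, so both possible switches can occur.
    c, d = e2 if rng.random() < 0.5 else e2[::-1]
    if len({a, b, c, d}) < 4:
        return edges
    new1 = tuple(sorted((a, d)))
    new2 = tuple(sorted((c, b)))
    if new1 in edges or new2 in edges:
        return edges
    return (edges - {e1, e2}) | {new1, new2}

# Usage: walk from a 6-cycle; the degree sequence never changes.
rng = random.Random(0)
graph = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)}
for _ in range(100):
    graph = switch_step(graph, rng)
print(sorted(graph), dict(degrees(graph)))
```

Each accepted switch removes one edge at each of the four vertices and adds one back, which is why the degree sequence is invariant.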
AC Feb 10, 2023
Predicting the cardinality and maximum degree of a reduced Gröbner basis
Shahrzad Jamshidi, Eric Kang, Sonja Petrović
We construct neural network regression models to predict key metrics of complexity for Gröbner bases of binomial ideals. This work illustrates why predictions with neural networks from Gröbner computations are not a straightforward process. Using two probabilistic models for random binomial ideals, we generate and make available a large data set that is able to capture sufficient variability in Gröbner complexity. We use this data to train neural networks and predict the cardinality of a reduced Gröbner basis and the maximum total degree of its elements. While the cardinality prediction problem is unlike classical problems tackled by machine learning, our simulations show that neural networks, providing performance statistics such as $r^2 = 0.401$, outperform a naive guess or multiple regression models with $r^2 = 0.180$.
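For readers unfamiliar with the reported statistic, the sketch below computes the usual coefficient of determination. It assumes that the $r^2$ values quoted above are of this form, which the abstract does not state explicitly. The function name `r_squared` and the toy numbers are illustrative only and are not data from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

# Toy targets (made-up numbers standing in for Groebner basis sizes).
sizes = [1, 2, 3, 4]
mean_guess = [2.5] * len(sizes)          # always predict the sample mean
print(r_squared(sizes, mean_guess))      # a mean-only predictor scores 0
print(r_squared(sizes, sizes))           # a perfect predictor scores 1
```

Under this definition, a score of 0.401 means the model explains about 40% of the variance in the target.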
44.6 SC May 5
Asymptotic properties of random monomial ideals
Fatemeh Mohammadi, Sonja Petrović, Eduardo Sáenz-de-Cabezón
This paper focuses on asymptotic properties of random monomial ideals through a statistical viewpoint. It extends the study of redundancy in monomial ideals by analyzing the poset density of the LCM-lattice. We explore how this density behaves across random algebraic models and structured networks. Experimental data reveal that the LCM-lattice exhibits sharp threshold behavior rather than changing smoothly. We observe a strong negative correlation between the number of generators and LCM-lattice density, abruptly separating three distinct regimes: a low-density Taylor-like regime, a high-density redundant regime, and a narrow transition window. We show that increasing the generator degree causes this density drop to occur at lower probability thresholds. We conclude by conjecturing that for equigenerated squarefree ideals, the LCM-lattice density undergoes a sharp phase transition, analogous to the emergence of giant components in hypergraphs. This suggests that the classical, ideal-by-ideal role of the LCM-lattice as a combinatorial invariant also admits a statistical/asymptotic counterpart: in natural random families, redundancy and resolution-complexity indicators concentrate into distinct typical regimes separated by a narrow transition window.
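To make the objects concrete, the following sketch builds the LCM-lattice of a random equigenerated squarefree monomial ideal. The random model used here (keep each squarefree monomial of degree `d` in `n` variables independently with probability `p`) is an assumption chosen for illustration, since the abstract does not specify its models. The abstract's notion of poset density is also not defined here, so the sketch only reports the lattice size next to the Taylor bound $2^r$ for $r$ generators.

```python
import random
from itertools import combinations

def random_squarefree_generators(n, d, p, rng):
    """Keep each squarefree degree-d monomial in n variables with probability p.

    A monomial is stored as the frozenset of its variable indices.
    Distinct monomials of equal degree never divide one another,
    so the result is already a minimal generating set.
    """
    return [frozenset(c) for c in combinations(range(n), d) if rng.random() < p]

def lcm_lattice(gens):
    """All lcms of subsets of the generators (the empty set stands for 1).

    For squarefree monomials the lcm is the union of supports.
    """
    lattice = {frozenset()}
    for g in gens:
        lattice |= {x | g for x in lattice}
    return lattice

rng = random.Random(0)
for p in (0.1, 0.3, 0.6):
    gens = random_squarefree_generators(n=6, d=2, p=p, rng=rng)
    size = len(lcm_lattice(gens))
    print(f"p={p}: {len(gens)} generators, lattice size {size}, "
          f"Taylor bound {2 ** len(gens)}")
```

The lattice reaches size $2^r$ only when no two subsets of generators share an lcm; repeated lcms are one simple sign of the redundancy the abstract studies.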
ML May 22, 2024
Learning to sample fibers for goodness-of-fit testing
Ivan Gvozdanović, Sonja Petrović
We consider the problem of constructing exact goodness-of-fit tests for discrete exponential family models. This classical problem remains practically unsolved for many types of structured or sparse data, as it rests on a computationally difficult core task: to produce a reliable sample from lattice points in a high-dimensional polytope. We translate the problem into a Markov decision process and demonstrate a reinforcement learning approach for learning `good moves' for sampling. We illustrate the approach on data sets and models for which traditional MCMC samplers converge too slowly due to problem size, sparsity structure, and the requirement to use prohibitive non-linear algebra computations in the process. The differentiating factor is the use of scalable tools from \emph{linear} algebra in the context of theoretical guarantees provided by \emph{non-linear} algebra. Our algorithm is based on an actor-critic sampling scheme, with provable convergence. The discovered moves can be used to efficiently obtain an exchangeable sample, significantly cutting computation time for statistical testing.
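For context, the kind of traditional fiber walk that the abstract contrasts with its learned moves can be sketched on the simplest example: two-way contingency tables with fixed row and column sums, connected by the standard $2\times 2$ basic moves. This is not the reinforcement learning method of the paper. The function name `fiber_walk` is hypothetical, and the walk below targets the uniform distribution on the fiber, which is an assumption made to keep the sketch short (an exact test would typically need a Metropolis-Hastings correction for the conditional distribution).

```python
import random

def fiber_walk(table, steps, rng):
    """Random walk over nonnegative integer tables with the margins of `table`.

    Each step picks rows i != j and columns k != l and proposes the basic move
    +1 at (i, k) and (j, l), -1 at (i, l) and (j, k).
    Proposals that would create a negative entry are rejected.
    Returns the list of visited tables (one per step).
    """
    rows, cols = len(table), len(table[0])
    current = [row[:] for row in table]
    visited = []
    for _ in range(steps):
        i, j = rng.sample(range(rows), 2)
        k, l = rng.sample(range(cols), 2)
        if current[i][l] > 0 and current[j][k] > 0:
            current[i][k] += 1
            current[j][l] += 1
            current[i][l] -= 1
            current[j][k] -= 1
        visited.append([row[:] for row in current])
    return visited

observed = [[3, 1, 2],
            [0, 4, 1],
            [2, 2, 5]]
walk = fiber_walk(observed, steps=1000, rng=random.Random(0))
print(walk[-1])
```

Every move lies in the integer kernel of the margin map, so each visited table is a lattice point of the same polytope as the observed table.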
AC Jun 7, 2021
Learning a performance metric of Buchberger's algorithm
Jelena Mojsilović, Dylan Peifer, Sonja Petrović
What can be (machine) learned about the complexity of Buchberger's algorithm? Given a system of polynomials, Buchberger's algorithm computes a Gröbner basis of the ideal these polynomials generate using an iterative procedure based on multivariate long division. The runtime of each step of the algorithm is typically dominated by a series of polynomial additions, and the total number of these additions is a hardware-independent performance metric that is often used to evaluate and optimize various implementation choices. In this work we attempt to predict, using just the starting input, the number of polynomial additions that take place during one run of Buchberger's algorithm. Good predictions are useful for quickly estimating difficulty and understanding what features make Gröbner basis computation hard. Our features and methods could also be used for value models in the reinforcement learning approach to optimize Buchberger's algorithm introduced in [Peifer, Stillman, and Halpern-Leistner, 2020]. We show that a multiple linear regression model built from a set of easy-to-compute ideal generator statistics can predict the number of polynomial additions somewhat well, better than an uninformed model, and better than regression models built on some intuitive commutative algebra invariants that are more difficult to compute. We also train a simple recursive neural network that outperforms these linear models. Our work serves as a proof of concept, demonstrating that predicting the number of polynomial additions in Buchberger's algorithm is a feasible problem from the point of view of machine learning.
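The quantity being predicted can be illustrated with a bare-bones Buchberger loop that counts polynomial additions. This is a teaching sketch under stated assumptions, not the implementation used in the paper: it uses lexicographic order, no pair-selection strategy or elimination criteria, and it counts one addition per S-polynomial and one per reduction step, which may differ from the paper's exact counting convention. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# A polynomial is a dict {exponent tuple: coefficient}; tuple comparison is lex order.

def lead(p):
    """Leading monomial and coefficient under lex order."""
    m = max(p)
    return m, p[m]

def sub_multiple(p, q, mono, coeff):
    """Return p - coeff * x^mono * q, dropping terms that cancel."""
    out = dict(p)
    for m, c in q.items():
        key = tuple(a + b for a, b in zip(m, mono))
        val = out.get(key, 0) - coeff * c
        if val == 0:
            out.pop(key, None)
        else:
            out[key] = val
    return out

def reduce_poly(p, basis, counter):
    """Multivariate long division of p by basis; counter[0] counts additions."""
    p, remainder = dict(p), {}
    while p:
        m, c = lead(p)
        for g in basis:
            gm, gc = lead(g)
            if all(x <= y for x, y in zip(gm, m)):   # lead(g) divides lead(p)
                shift = tuple(x - y for x, y in zip(m, gm))
                p = sub_multiple(p, g, shift, c / gc)
                counter[0] += 1
                break
        else:
            remainder[m] = c                          # term is irreducible
            del p[m]
    return remainder

def s_poly(f, g):
    """S-polynomial: cancel the leading terms of f and g at their lcm."""
    (fm, fc), (gm, gc) = lead(f), lead(g)
    lcm = tuple(max(a, b) for a, b in zip(fm, gm))
    sf = tuple(x - y for x, y in zip(lcm, fm))
    sg = tuple(x - y for x, y in zip(lcm, gm))
    left = sub_multiple({}, f, sf, Fraction(-1) / fc)   # (1/fc) * x^sf * f
    return sub_multiple(left, g, sg, Fraction(1) / gc)

def buchberger(gens):
    """Return (Groebner basis, number of polynomial additions performed)."""
    basis = [{m: Fraction(c) for m, c in g.items()} for g in gens]
    pairs = list(combinations(range(len(basis)), 2))
    counter = [0]
    while pairs:
        i, j = pairs.pop()
        s = s_poly(basis[i], basis[j])
        counter[0] += 1
        r = reduce_poly(s, basis, counter)
        if r:
            pairs.extend((k, len(basis)) for k in range(len(basis)))
            basis.append(r)
    return basis, counter[0]

# Usage: the ideal (x^2 - y, x*y - 1) in variables (x, y).
f1 = {(2, 0): 1, (0, 1): -1}
f2 = {(1, 1): 1, (0, 0): -1}
basis, additions = buchberger([f1, f2])
print(len(basis), "basis elements,", additions, "polynomial additions")
```

The count depends on the input alone once the order and pair-processing rule are fixed, which is what makes it a hardware-independent target for prediction.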