77.8LOMay 22
The complexity of Presburger arithmetic with power or powersMichael Benedikt, Dmitry Chistikov, Alessio Mansutti
We investigate expansions of Presburger arithmetic, i.e., the theory of the integers with addition and order, with additional structure related to exponentiation: either a function that takes a number to the power of $2$, or a predicate for the powers of $2$. The latter theory, denoted $\mathrm{PresPower}$, was introduced by Büchi as a first attempt at characterizing the sets of tuples of numbers that can be expressed using finite automata; Büchi's method does not give an elementary upper bound, and the complexity of this theory has been open. The former theory, denoted as $\mathrm{PresExp}$, was shown decidable by Semenov; while the decision procedure for this theory differs radically from the automata-based method proposed by Büchi, Semenov's method is also non-elementary. And in fact, the theory with the power function has a non-elementary lower bound. In this paper, we show that while Semenov's and Büchi's approaches yield non-elementary blow-ups for $\mathrm{PresPower}$, the theory is in fact decidable in triply exponential time, similarly to the best known quantifier-elimination algorithm for Presburger arithmetic. We also provide a $\mathrm{NExpTime}$ upper bound for the existential fragment of $\mathrm{PresExp}$, a step towards a finer-grained analysis of its complexity. Both these results are established by analyzing a single parameterized satisfiability algorithm for $\mathrm{PresExp}$, which can be specialized to either the setting of $\mathrm{PresPower}$ or the existential theory of $\mathrm{PresExp}$. Besides the new upper bounds for the existential theory of $\mathrm{PresExp}$ and $\mathrm{PresPower}$, we believe our algorithm provides new intuition for the decidability of these theories, and for the features that lead to non-elementary blow-ups.
LGJun 10, 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated InputsDmitry Chistikov, Matthias Englert, Ranko Lazic
We prove that, for the fundamental regression task of learning a single neuron, training a one-hidden layer ReLU network of any width by gradient flow from a small initialisation converges to zero loss and is implicitly biased to minimise the rank of network parameters. By assuming that the training points are correlated with the teacher neuron, we complement previous work that considered orthogonal datasets. Our results are based on a detailed non-asymptotic analysis of the dynamics of each hidden neuron throughout the training. We also show and characterise a surprising distinction in this setting between interpolator networks of minimal rank and those of minimal Euclidean norm. Finally we perform a range of numerical experiments, which corroborate our theoretical findings.
77.2FLMay 19
Intersecting Dense AutomataDmitry Chistikov, Neha Rino
We observe that the classical Cartesian product construction for the intersection of (languages of) nondeterministic finite automata (NFA) is non-optimal in the worst case, if the automata have many transitions. For a fixed alphabet, the product of two NFA may have $Θ(m^2)$ transitions if these NFA have at most $n$ states and $m$ transitions each. We describe alternative constructions with $O(m n)$ transitions: or $O(m n^{k-1})$ for the intersection of $k$ NFA (for fixed $k \ge 2$ and alphabet $Σ$). This gives a faster algorithm for deciding NFA intersection emptiness. The new algorithm is optimal, unless there exists a breakthrough combinatorial algorithm for detecting $(k+1)$-cliques in undirected graphs. This also leads to a more efficient certification scheme for NFA intersection emptiness.
32.3SIMay 14
On the Limits of PAC Learning of Networks from Opinion DynamicsDmitry Chistikov, Luisa Estrada, Mike Paterson et al.
Agents in social networks with threshold-based dynamics change opinions when influenced by sufficiently many peers. Existing literature typically assumes that the network structure and dynamics are fully known, which is often unrealistic. In this work, we ask how to learn a network structure from samples of the agents' synchronous opinion updates. Firstly, if the opinion dynamics follow a threshold rule in which a fixed number of influencers prevent opinion change (e.g., unanimity and quasi-unanimity), we provide an efficient PAC learning algorithm provided that the number of influencers per agent is bounded. Secondly, under standard computational complexity assumptions, we prove that if agents' opinions follow the majority of their influencers, then there is no efficient PAC learning algorithm. We propose a polynomial-time heuristic that successfully learns consistent networks in over $98\%$ of our simulations on random graphs, with no failures for some specified conditions on the numbers of agents and opinion diffusion examples.
FLMay 23, 2016
On Restricted Nonnegative Matrix FactorizationDmitry Chistikov, Stefan Kiefer, Ines Marušić et al.
Nonnegative matrix factorization (NMF) is the problem of decomposing a given nonnegative $n \times m$ matrix $M$ into a product of a nonnegative $n \times d$ matrix $W$ and a nonnegative $d \times m$ matrix $H$. Restricted NMF requires in addition that the column spaces of $M$ and $W$ coincide. Finding the minimal inner dimension $d$ is known to be NP-hard, both for NMF and restricted NMF. We show that restricted NMF is closely related to a question about the nature of minimal probabilistic automata, posed by Paz in his seminal 1971 textbook. We use this connection to answer Paz's question negatively, thus falsifying a positive answer claimed in 1974. Furthermore, we investigate whether a rational matrix $M$ always has a restricted NMF of minimal inner dimension whose factors $W$ and $H$ are also rational. We show that this holds for matrices $M$ of rank at most $3$ and we exhibit a rank-$4$ matrix for which $W$ and $H$ require irrational entries.
CCMay 22, 2016
Nonnegative Matrix Factorization Requires IrrationalityDmitry Chistikov, Stefan Kiefer, Ines Marušić et al.
Nonnegative matrix factorization (NMF) is the problem of decomposing a given nonnegative $n \times m$ matrix $M$ into a product of a nonnegative $n \times d$ matrix $W$ and a nonnegative $d \times m$ matrix $H$. A longstanding open question, posed by Cohen and Rothblum in 1993, is whether a rational matrix $M$ always has an NMF of minimal inner dimension $d$ whose factors $W$ and $H$ are also rational. We answer this question negatively, by exhibiting a matrix for which $W$ and $H$ require irrational entries.
DMFeb 11, 2016
Hitting Families of Schedules for Asynchronous ProgramsDmitry Chistikov, Rupak Majumdar, Filip Niksic
We consider the following basic task in the testing of concurrent systems. The input to the task is a partial order of events, which models actions performed on or by the system and specifies ordering constraints between them. The task is to determine if some scheduling of these events can result in a bug. The number of schedules to be explored can, in general, be exponential. Empirically, many bugs in concurrent programs have been observed to have small bug depth; that is, these bugs are exposed by every schedule that orders $d$ specific events in a particular way, irrespective of how the other events are ordered, and $d$ is small compared to the total number of events. To find all bugs of depth $d$, one needs to only test a $d$-hitting family of schedules: we call a set of schedules a $d$-hitting family if for each set of $d$ events, and for each allowed ordering of these events, there is some schedule in the family that executes these events in this ordering. The size of a $d$-hitting family may be much smaller than the number of all possible schedules, and a natural question is whether one can find $d$-hitting families of schedules that have small size. In general, finding the size of optimal $d$-hitting families is hard, even for $d=2$. We show, however, that when the partial order is a tree, one can explicitly construct $d$-hitting families of schedules of small size. When the tree is balanced, our constructions are polylogarithmic in the number of events.
LONov 3, 2014
Approximate Counting in SMT and Value Estimation for Probabilistic ProgramsDmitry Chistikov, Rayna Dimitrova, Rupak Majumdar
#SMT, or model counting for logical theories, is a well-known hard problem that generalizes such tasks as counting the number of satisfying assignments to a Boolean formula and computing the volume of a polytope. In the realm of satisfiability modulo theories (SMT) there is a growing need for model counting solvers, coming from several application domains (quantitative information flow, static analysis of probabilistic programs). In this paper, we show a reduction from an approximate version of #SMT to SMT. We focus on the theories of integer arithmetic and linear real arithmetic. We propose model counting algorithms that provide approximate solutions with formal bounds on the approximation error. They run in polynomial time and make a polynomial number of queries to the SMT solver for the underlying theory, exploiting "for free" the sophisticated heuristics implemented within modern SMT solvers. We have implemented the algorithms and used them to solve the value problem for a model of loop-free probabilistic programs with nondeterminism.