SYFeb 7, 2017
Graphical Models and Belief Propagation-hierarchy for Optimal Physics-Constrained Network FlowsMichael Chertkov, Sidhant Misra, Marc Vuffray et al.
In this manuscript we review new ideas and first results on application of the Graphical Models approach, originated from Statistical Physics, Information Theory, Computer Science and Machine Learning, to optimization problems of network flow type with additional constraints related to the physics of the flow. We illustrate the general concepts on a number of enabling examples from power system and natural gas transmission (continental scale) and distribution (district scale) systems.
QUANT-PHApr 8, 2023
Learning Energy-Based Representations of Quantum Many-Body StatesAbhijith Jayakumar, Marc Vuffray, Andrey Y. Lokhov
Efficient representation of quantum many-body states on classical computers is a problem of enormous practical interest. An ideal representation of a quantum state combines a succinct characterization informed by the system's structure and symmetries, along with the ability to predict the physical observables of interest. A number of machine learning approaches have been recently used to construct such classical representations [1-6] which enable predictions of observables [7] and account for physical symmetries [8]. However, the structure of a quantum state gets typically lost unless a specialized ansatz is employed based on prior knowledge of the system [9-12]. Moreover, most such approaches give no information about what states are easier to learn in comparison to others. Here, we propose a new generative energy-based representation of quantum many-body states derived from Gibbs distributions used for modeling the thermal states of classical spin systems. Based on the prior information on a family of quantum states, the energy function can be specified by a small number of parameters using an explicit low-degree polynomial or a generic parametric family such as neural nets, and can naturally include the known symmetries of the system. Our results show that such a representation can be efficiently learned from data using exact algorithms in a form that enables the prediction of expectation values of physical observables. Importantly, the structure of the learned energy function provides a natural explanation for the hardness of learning for a given class of quantum states.
LGFeb 12
Computationally sufficient statistics for Ising modelsAbhijith Jayakumar, Shreya Shukla, Marc Vuffray et al.
Learning Gibbs distributions using only sufficient statistics has long been recognized as a computationally hard problem. On the other hand, computationally efficient algorithms for learning Gibbs distributions rely on access to full sample configurations generated from the model. For many systems of interest that arise in physical contexts, expecting a full sample to be observed is not practical, and hence it is important to look for computationally efficient methods that solve the learning problem with access to only a limited set of statistics. We examine the trade-offs between the power of computation and observation within this scenario, employing the Ising model as a paradigmatic example. We demonstrate that it is feasible to reconstruct the model parameters for a model with $\ell_1$ width $γ$ by observing statistics up to an order of $O(γ)$. This approach allows us to infer the model's structure and also learn its couplings and magnetic fields. We also discuss a setting where prior information about structure of the model is available and show that the learning problem can be solved efficiently with even more limited observational power.
MLFeb 23
Selecting Optimal Variable Order in Autoregressive Ising ModelsShiba Biswal, Marc Vuffray, Andrey Y. Lokhov
Autoregressive models enable tractable sampling from learned probability distributions, but their performance critically depends on the variable ordering used in the factorization via complexities of the resulting conditional distributions. We propose to learn the Markov random field describing the underlying data, and use the inferred graphical model structure to construct optimized variable orderings. We illustrate our approach on two-dimensional image-like models where a structure-aware ordering leads to restricted conditioning sets, thereby reducing model complexity. Numerical experiments on Ising models with discrete data demonstrate that graph-informed orderings yield higher-fidelity generated samples compared to naive variable orderings.
LGMay 13
Finite Sample Bounds for Learning with Score MatchingDevin Smedira, Abhijith Jayakumar, Sidhant Misra et al.
Learning of continuous exponential family distributions with unbounded support remains an important area of research for both theory and applications in high-dimensional statistics. In recent years, score matching has become a widely used method for learning exponential families with continuous variables due to its computational ease when compared against maximum likelihood estimation. However, theoretical understanding of the statistical properties of score matching is still lacking. In this work, we provide a non-asymptotic sample complexity analysis for learning the structure of exponential families of polynomials with score matching. The derived sample bounds show a polynomial dependence on the model dimension. These bounds are the first of its kind, as all prior work has shown only asymptotic bounds on the sample complexity.
MLOct 17, 2024
Discrete distributions are learnable from metastable samplesAbhijith Jayakumar, Andrey Y. Lokhov, Sidhant Misra et al.
Physically motivated stochastic dynamics are often used to sample from high-dimensional distributions. However such dynamics often get stuck in specific regions of their state space and mix very slowly to the desired stationary state. This causes such systems to approximately sample from a metastable distribution which is usually quite different from the desired, stationary distribution of the dynamic. We rigorously show that, in the case of multi-variable discrete distributions, the true model describing the stationary distribution can be recovered from samples produced from a metastable distribution under minimal assumptions about the system. This follows from a fundamental observation that the single-variable conditionals of metastable distributions that satisfy a strong metastability condition are on average close to those of the stationary distribution. This holds even when the metastable distribution differs considerably from the true model in terms of global metrics like Kullback-Leibler divergence or total variation distance. This property allows us to learn the true model using a conditional likelihood based estimator, even when the samples come from a metastable distribution concentrated in a small region of the state space. Explicit examples of such metastable states can be constructed from regions that effectively bottleneck the probability flow and cause poor mixing of the Markov chain. For specific cases of binary pairwise undirected graphical models (i.e. Ising models), we extend our results to further rigorously show that data coming from metastable states can be used to learn the parameters of the energy function and recover the structure of the model.
MLDec 5, 2025
Symmetric Linear Dynamical Systems are Learnable from Few ObservationsMinh Vu, Andrey Y. Lokhov, Marc Vuffray
We consider the problem of learning the parameters of a $N$-dimensional stochastic linear dynamics under both full and partial observations from a single trajectory of time $T$. We introduce and analyze a new estimator that achieves a small maximum element-wise error on the recovery of symmetric dynamic matrices using only $T=\mathcal{O}(\log N)$ observations, irrespective of whether the matrix is sparse or dense. This estimator is based on the method of moments and does not rely on problem-specific regularization. This is especially important for applications such as structure discovery.
QUANT-PHSep 3, 2021
High-quality Thermal Gibbs Sampling with Quantum Annealing HardwareJon Nelson, Marc Vuffray, Andrey Y. Lokhov et al.
Quantum Annealing (QA) was originally intended for accelerating the solution of combinatorial optimization tasks that have natural encodings as Ising models. However, recent experiments on QA hardware platforms have demonstrated that, in the operating regime corresponding to weak interactions, the QA hardware behaves like a noisy Gibbs sampler at a hardware-specific effective temperature. This work builds on those insights and identifies a class of small hardware-native Ising models that are robust to noise effects and proposes a procedure for executing these models on QA hardware to maximize Gibbs sampling performance. Experimental results indicate that the proposed protocol results in high-quality Gibbs samples from a hardware-specific effective temperature. Furthermore, we show that this effective temperature can be adjusted by modulating the annealing time and energy scale. The procedure proposed in this work provides an approach to using QA hardware for Ising model sampling presenting potential new opportunities for applications in machine learning and physics simulation.
QUANT-PHApr 7, 2021
Single-Qubit Fidelity Assessment of Quantum Annealing HardwareJon Nelson, Marc Vuffray, Andrey Y. Lokhov et al.
As a wide variety of quantum computing platforms become available, methods for assessing and comparing the performance of these devices are of increasing interest and importance. Inspired by the success of single-qubit error rate computations for tracking the progress of gate-based quantum computers, this work proposes a Quantum Annealing Single-qubit Assessment (QASA) protocol for quantifying the performance of individual qubits in quantum annealing computers. The proposed protocol scales to large quantum annealers with thousands of qubits and provides unique insights into the distribution of qubit properties within a particular hardware device. The efficacy of the QASA protocol is demonstrated by analyzing the properties of a D-Wave 2000Q system, revealing unanticipated correlations in the qubit performance of that device. A study repeating the QASA protocol at different annealing times highlights how the method can be utilized to understand the impact of annealing parameters on qubit performance. Overall, the proposed QASA protocol provides a useful tool for assessing the performance of current and emerging quantum annealing devices.
LGApr 2, 2021
Exponential Reduction in Sample Complexity with Learning of Ising Model DynamicsArkopal Dutt, Andrey Y. Lokhov, Marc Vuffray et al.
The usual setting for learning the structure and parameters of a graphical model assumes the availability of independent samples produced from the corresponding multivariate probability distribution. However, for many models the mixing time of the respective Markov chain can be very large and i.i.d. samples may not be obtained. We study the problem of reconstructing binary graphical models from correlated samples produced by a dynamical process, which is natural in many applications. We analyze the sample complexity of two estimators that are based on the interaction screening objective and the conditional likelihood loss. We observe that for samples coming from a dynamical process far from equilibrium, the sample complexity reduces exponentially compared to a dynamical process that mixes quickly.
LGFeb 18, 2021
Learning Continuous Exponential Families Beyond GaussianChristopher X. Ren, Sidhant Misra, Marc Vuffray et al.
We address the problem of learning of continuous exponential family distributions with unbounded support. While a lot of progress has been made on learning of Gaussian graphical models, we still lack scalable algorithms for reconstructing general continuous exponential families modeling higher-order moments of the data beyond the mean and the covariance. Here, we introduce a computationally efficient method for learning continuous graphical models based on the Interaction Screening approach. Through a series of numerical experiments, we show that our estimator maintains similar requirements in terms of accuracy and sample complexity scalings compared to alternative approaches such as maximization of conditional likelihood, while considerably improving upon the algorithm's run-time.
QUANT-PHDec 16, 2020
Programmable Quantum Annealers as Noisy Gibbs SamplersMarc Vuffray, Carleton Coffrin, Yaroslav A. Kharkov et al.
Drawing independent samples from high-dimensional probability distributions represents the major computational bottleneck for modern algorithms, including powerful machine learning frameworks such as deep learning. The quest for discovering larger families of distributions for which sampling can be efficiently realized has inspired an exploration beyond established computing methods and turning to novel physical devices that leverage the principles of quantum computation. Quantum annealing embodies a promising computational paradigm that is intimately related to the complexity of energy landscapes in Gibbs distributions, which relate the probabilities of system states to the energies of these states. Here, we study the sampling properties of physical realizations of quantum annealers which are implemented through programmable lattices of superconducting flux qubits. Comprehensive statistical analysis of the data produced by these quantum machines shows that quantum annealers behave as samplers that generate independent configurations from low-temperature noisy Gibbs distributions. We show that the structure of the output distribution probes the intrinsic physical properties of the quantum device such as effective temperature of individual qubits and magnitude of local qubit noise, which result in a non-linear response function and spurious interactions that are absent in the hardware implementation. We anticipate that our methodology will find widespread use in characterization of future generations of quantum annealers and other emerging analog computing devices.
LGJun 21, 2020
Learning of Discrete Graphical Models with Neural NetworksAbhijith J., Andrey Y. Lokhov, Sidhant Misra et al.
Graphical models are widely used in science to represent joint probability distributions with an underlying conditional dependence structure. The inverse problem of learning a discrete graphical model given i.i.d samples from its joint distribution can be solved with near-optimal sample complexity using a convex optimization method known as Generalized Regularized Interaction Screening Estimator (GRISE). But the computational cost of GRISE becomes prohibitive when the energy function of the true graphical model has higher-order terms. We introduce NeurISE, a neural net based algorithm for graphical model learning, to tackle this limitation of GRISE. We use neural nets as function approximators in an Interaction Screening objective function. The optimization of this objective then produces a neural-net representation for the conditionals of the graphical model. NeurISE algorithm is seen to be a better alternative to GRISE when the energy function of the true model has a high order with a high degree of symmetry. In these cases NeurISE is able to find the correct parsimonious representation for the conditionals without being fed any prior information about the true model. NeurISE can also be used to learn the underlying structure of the true model with some simple modifications to its training procedure. In addition, we also show a variant of NeurISE that can be used to learn a neural net representation for the full energy function of the true model.
SPNov 14, 2019
Real-time Anomaly Detection and Classification in Streaming PMU DataChristopher Hannon, Deepjyoti Deka, Dong Jin et al.
Ensuring secure and reliable operations of the power grid is a primary concern of system operators. Phasor measurement units (PMUs) are rapidly being deployed in the grid to provide fast-sampled operational data that should enable quicker decision-making. This work presents a general interpretable framework for analyzing real-time PMU data, and thus enabling grid operators to understand the current state and to identify anomalies on the fly. Applying statistical learning tools on the streaming data, we first learn an effective dynamical model to describe the current behavior of the system. Next, we use the probabilistic predictions of our learned model to define in a principled way an efficient anomaly detection tool. Finally, the last module of our framework produces on-the-fly classification of the detected anomalies into common occurrence classes using features that grid operators are familiar with. We demonstrate the efficacy of our interpretable approach through extensive numerical experiments on real PMU data collected from a transmission operator in the USA.
LGFeb 2, 2019
Efficient Learning of Discrete Graphical ModelsMarc Vuffray, Sidhant Misra, Andrey Y. Lokhov
Graphical models are useful tools for describing structured high-dimensional probability distributions. Development of efficient algorithms for learning graphical models with least amount of data remains an active research topic. Reconstruction of graphical models that describe the statistics of discrete variables is a particularly challenging problem, for which the maximum likelihood approach is intractable. In this work, we provide the first sample-efficient method based on the Interaction Screening framework that allows one to provably learn fully general discrete factor models with node-specific discrete alphabets and multi-body interactions, specified in an arbitrary basis. We identify a single condition related to model parametrization that leads to rigorous guarantees on the recovery of model structure and parameters in any error norm, and is readily verifiable for a large class of models. Importantly, our bounds make explicit distinction between parameters that are proper to the model and priors used as an input to the algorithm. Finally, we show that the Interaction Screening framework includes all models previously considered in the literature as special cases, and for which our analysis shows a systematic improvement in sample complexity.
SYOct 27, 2017
Online Learning of Power Transmission DynamicsAndrey Y. Lokhov, Marc Vuffray, Dmitry Shemetov et al.
We consider the problem of reconstructing the dynamic state matrix of transmission power grids from time-stamped PMU measurements in the regime of ambient fluctuations. Using a maximum likelihood based approach, we construct a family of convex estimators that adapt to the structure of the problem depending on the available prior information. The proposed method is fully data-driven and does not assume any knowledge of system parameters. It can be implemented in near real-time and requires a small amount of data. Our learning algorithms can be used for model validation and calibration, and can also be applied to related problems of system stability, detection of forced oscillations, generation re-dispatch, as well as to the estimation of the system state.
LGMar 15, 2017
Information Theoretic Optimal Learning of Gaussian Graphical ModelsSidhant Misra, Marc Vuffray, Andrey Y. Lokhov
What is the optimal number of independent observations from which a sparse Gaussian Graphical Model can be correctly recovered? Information-theoretic arguments provide a lower bound on the minimum number of samples necessary to perfectly identify the support of any multivariate normal distribution as a function of model parameters. For a model defined on a sparse graph with $p$ nodes, a maximum degree $d$ and minimum normalized edge strength $κ$, this necessary number of samples scales at least as $d \log p/κ^2$. The sample complexity requirements of existing methods for perfect graph reconstruction exhibit dependency on additional parameters that do not enter in the lower bound. The question of whether the lower bound is tight and achievable by a polynomial time algorithm remains open. In this paper, we constructively answer this question and propose an algorithm, termed DICE, whose sample complexity matches the information-theoretic lower bound up to a universal constant factor. We also propose a related algorithm SLICE that has a slightly higher sample complexity, but can be implemented as a mixed integer quadratic program which makes it attractive in practice. Importantly, SLICE retains a critical advantage of DICE in that its sample complexity only depends on quantities present in the information theoretic lower bound. We anticipate that this result will stimulate future search of computationally efficient sample-optimal algorithms.
STAT-MECHDec 15, 2016
Optimal structure and parameter learning of Ising modelsAndrey Y. Lokhov, Marc Vuffray, Sidhant Misra et al.
Reconstruction of structure and parameters of an Ising model from binary samples is a problem of practical importance in a variety of disciplines, ranging from statistical physics and computational biology to image processing and machine learning. The focus of the research community shifted towards developing universal reconstruction algorithms which are both computationally efficient and require the minimal amount of expensive data. We introduce a new method, Interaction Screening, which accurately estimates the model parameters using local optimization problems. The algorithm provably achieves perfect graph structure recovery with an information-theoretically optimal number of samples, notably in the low-temperature regime which is known to be the hardest for learning. The efficacy of Interaction Screening is assessed through extensive numerical tests on synthetic Ising models of various topologies with different types of interactions, as well as on a real data produced by a D-Wave quantum computer. This study shows that the Interaction Screening method is an exact, tractable and optimal technique universally solving the inverse Ising problem.
SYJun 21, 2016
Graphical Models for Optimal Power FlowKrishnamurthy Dvijotham, Pascal Van Hentenryck, Michael Chertkov et al.
Optimal power flow (OPF) is the central optimization problem in electric power grids. Although solved routinely in the course of power grid operations, it is known to be strongly NP-hard in general, and weakly NP-hard over tree networks. In this paper, we formulate the optimal power flow problem over tree networks as an inference problem over a tree-structured graphical model where the nodal variables are low-dimensional vectors. We adapt the standard dynamic programming algorithm for inference over a tree-structured graphical model to the OPF problem. Combining this with an interval discretization of the nodal variables, we develop an approximation algorithm for the OPF problem. Further, we use techniques from constraint programming (CP) to perform interval computations and adaptive bound propagation to obtain practically efficient algorithms. Compared to previous algorithms that solve OPF with optimality guarantees using convex relaxations, our approach is able to work for arbitrary distribution networks and handle mixed-integer optimization problems. Further, it can be implemented in a distributed message-passing fashion that is scalable and is suitable for "smart grid" applications like control of distributed energy resources. We evaluate our technique numerically on several benchmark networks and show that practical OPF problems can be solved effectively using this approach.
LGMay 24, 2016
Interaction Screening: Efficient and Sample-Optimal Learning of Ising ModelsMarc Vuffray, Sidhant Misra, Andrey Y. Lokhov et al.
We consider the problem of learning the underlying graph of an unknown Ising model on p spins from a collection of i.i.d. samples generated from the model. We suggest a new estimator that is computationally efficient and requires a number of samples that is near-optimal with respect to previously established information-theoretic lower-bound. Our statistical estimator has a physical interpretation in terms of "interaction screening". The estimator is consistent and is efficiently implemented using convex optimization. We prove that with appropriate regularization, the estimator recovers the underlying graph using a number of samples that is logarithmic in the system size p and exponential in the maximum coupling-intensity and maximum node-degree.
SYJun 19, 2015
Natural Gas Flow Solutions with Guarantees: A Monotone Operator Theory ApproachKrishnamurthy Dvijotham, Marc Vuffray, Sidhant Misra et al.
We consider balanced flows in a natural gas transmission network and discuss computationally hard problems such as establishing if solution of the underlying nonlinear gas flow equations exists, if it is unique, and finding the solution. Particular topologies, e.g. trees, are known to be easy to solve based on a variational description of the gas flow equations, but these approaches do not generalize. In this paper, we show that the gas flow problem can be solved efficiently using the tools of monotone operator theory, provided that we look for solution within certain monotonicity domains. We characterize a family of monotonicity domains, described in terms of Linear Matrix Inequalities (LMI) in the state variables, each containing at most one solution. We also develop an efficient algorithm to choose a particular monotonicity domain, for which the LMI based condition simplifies to a bound on the flows. Performance of the technique is illustrated on exemplary gas networks.