Testing Ising Models
This work addresses the intractability of distribution testing in high dimensions, a problem of interest to both statisticians and computer scientists. It offers a novel approach for structured data, though the contribution is incremental in that it takes existing testing problems and applies them to a specific model.
The paper tackles the problem of testing independence and goodness-of-fit for high-dimensional distributions, which in general requires exponential sample complexity, by focusing on structured distributions such as Ising models. It demonstrates that in this setting, sample- and time-efficient testers can be developed, avoiding the curse of dimensionality.
Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution? Similarly, is it possible to distinguish whether $p$ equals a given distribution $q$ versus $p$ and $q$ being far from each other? These problems of testing independence and goodness-of-fit have received enormous attention in statistics, information theory, and theoretical computer science, with sample-optimal algorithms known in several interesting regimes of parameters. Unfortunately, it has also been understood that these problems become intractable in large dimensions, necessitating exponential sample complexity. Motivated by the exponential lower bounds for general distributions as well as the ubiquity of Markov Random Fields (MRFs) in the modeling of high-dimensional distributions, we initiate the study of distribution testing on structured multivariate distributions, and in particular the prototypical example of MRFs: the Ising Model. We demonstrate that, in this structured setting, we can avoid the curse of dimensionality, obtaining sample and time efficient testers for independence and goodness-of-fit. One of the key technical challenges we face along the way is bounding the variance of functions of the Ising model.
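For concreteness, the Ising model and the two testing problems can be written out as follows. This is only a sketch in standard notation and is not taken from the text above: the parameter symbols $\theta_{uv}$ and $\theta_v$, the distance measure $d$, and the proximity parameter $\varepsilon$ are assumptions introduced here for illustration, and the abstract does not specify which notion of distance is used.

```latex
% Ising model on n nodes, with x in {-1,+1}^n.
% theta_{uv} (edge parameters) and theta_v (node parameters) are illustrative names.
p_\theta(x) = \frac{1}{Z(\theta)}
  \exp\Big( \sum_{u < v} \theta_{uv}\, x_u x_v + \sum_{v} \theta_v\, x_v \Big),
\qquad
Z(\theta) = \sum_{x \in \{-1,+1\}^n}
  \exp\Big( \sum_{u < v} \theta_{uv}\, x_u x_v + \sum_{v} \theta_v\, x_v \Big).

% Independence testing: d and \varepsilon are assumed, not given in the abstract.
H_0 :\; p = \prod_{i=1}^{n} p_i
\quad \text{versus} \quad
H_1 :\; \inf_{q \text{ product}} d(p, q) \ge \varepsilon.

% Goodness-of-fit testing against a given distribution q.
H_0 :\; p = q
\quad \text{versus} \quad
H_1 :\; d(p, q) \ge \varepsilon.
```

In this parameterization, $p_\theta$ is a product distribution exactly when every edge parameter $\theta_{uv}$ is zero, so for Ising models independence testing amounts to asking whether any pairwise interaction is present.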