Independence Testing for Bounded Degree Bayesian Network
This work addresses the challenge of efficient independence testing for high-dimensional distributions with sparse dependencies, offering a significant reduction from exponential to linear sample complexity for researchers in statistics and machine learning.
The paper tackles the problem of independence testing for distributions with sparse Bayesian network structures, showing that only linearly many samples are required when the underlying DAG has bounded in-degree, specifically providing necessary and sufficient sample complexity of θ̃(2^{d/2}·n/ε²).
We study the following independence testing problem: given access to samples from a distribution $P$ over $\{0,1\}^n$, decide whether $P$ is a product distribution or whether it is $\varepsilon$-far in total variation distance from any product distribution. For arbitrary distributions, this problem requires $\exp(n)$ samples. We show in this work that if $P$ has a sparse structure, then in fact only linearly many samples are required. Specifically, if $P$ is Markov with respect to a Bayesian network whose underlying DAG has in-degree bounded by $d$, then $\tildeΘ(2^{d/2}\cdot n/\varepsilon^2)$ samples are necessary and sufficient for independence testing.