MLLGMar 12, 2019

Testing Conditional Independence on Discrete Data using Stochastic Complexity

arXiv:1903.04829v126 citations
Originality Incremental advance
AI Analysis

This work addresses a core bottleneck in causal discovery for researchers and practitioners, offering an incremental improvement over existing tests.

The paper tackles the problem of testing conditional independence on discrete data, which often fails in practice, by proposing a new test called SCI based on algorithmic independence and stochastic complexity, resulting in lower type II error and higher recall in causal discovery without compromising precision.

Testing for conditional independence is a core aspect of constraint-based causal discovery. Although commonly used tests are perfect in theory, they often fail to reject independence in practice, especially when conditioning on multiple variables. We focus on discrete data and propose a new test based on the notion of algorithmic independence that we instantiate using stochastic complexity. Amongst others, we show that our proposed test, SCI, is an asymptotically unbiased as well as $L_2$ consistent estimator for conditional mutual information (CMI). Further, we show that SCI can be reformulated to find a sensible threshold for CMI that works well on limited samples. Empirical evaluation shows that SCI has a lower type II error than commonly used tests. As a result, we obtain a higher recall when we use SCI in causal discovery algorithms, without compromising the precision.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes