Conditional Dependence via Shannon Capacity: Axioms, Estimators and Applications
This work provides a method for estimating the strength of a known causal relationship with reduced sample requirements, which is useful in fields such as biology and data analysis, though it appears incremental because it builds on the existing concept of Shannon capacity.
The paper tackles the problem of estimating the strength of a known causal relationship between a pair of variables. It proposes Shannon capacity, a measure that depends only on the conditional distribution of the effect given the cause, and introduces a novel fixed-k nearest neighbor estimator that reduces sample complexity in a single-cell flow-cytometry application.
We conduct an axiomatic study of the problem of estimating the strength of a known causal relationship between a pair of variables. We propose that an estimate of causal strength should be based on the conditional distribution of the effect given the cause (and not on the driving distribution of the cause), and study dependence measures on conditional distributions. Shannon capacity, appropriately regularized, emerges as a natural measure under these axioms. We examine the problem of calculating Shannon capacity from the observed samples and propose a novel fixed-$k$ nearest neighbor estimator, and demonstrate its consistency. Finally, we demonstrate an application to single-cell flow-cytometry, where the proposed estimators significantly reduce sample complexity.
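To make concrete the idea that Shannon capacity depends only on the conditional distribution of the effect given the cause, the following minimal sketch computes the capacity of a known discrete channel with the standard Blahut-Arimoto iteration. It is not the fixed-$k$ nearest neighbor estimator proposed in the paper, which works from observed samples; the function name `blahut_arimoto`, the matrix `W`, and the tolerance settings are illustrative assumptions.

```python
import numpy as np


def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Capacity of a discrete channel, assuming W[x, y] = P(Y=y | X=x) is known.

    Returns (capacity in bits, capacity-achieving input distribution).
    Illustrative sketch only; not the sample-based estimator from the paper.
    """
    W = np.asarray(W, dtype=float)
    n_x = W.shape[0]
    p = np.full(n_x, 1.0 / n_x)  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        q = p @ W  # output marginal induced by the current input distribution
        # D[x] = KL divergence (in nats) between row W[x, :] and the marginal q.
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log(W / q), 0.0)
        D = np.sum(W * log_ratio, axis=1)
        lower = np.log(np.sum(p * np.exp(D)))  # lower bound on capacity
        upper = np.max(D)                      # upper bound on capacity
        p = p * np.exp(D)                      # multiplicative update of the input
        p /= p.sum()
        if upper - lower < tol:
            break
    return lower / np.log(2), p


if __name__ == "__main__":
    # Binary symmetric channel with crossover probability 0.1 (hypothetical example).
    bsc = [[0.9, 0.1],
           [0.1, 0.9]]
    capacity_bits, p_opt = blahut_arimoto(bsc)
    print(capacity_bits, p_opt)
```

The function takes only the channel matrix as input, so whatever distribution drives the cause has no effect on the result. For the binary symmetric channel in the usage example, the result should agree with the closed form $1 - H(0.1)$, roughly 0.531 bits.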