Detecting low-complexity unobserved causes
This work addresses a problem motivated by statistical genetics: determining whether a genetic marker is itself causal for a phenotype or merely correlated with a causal marker. The contribution appears incremental, since it builds on existing causal inference methods.
The paper distinguishes a direct causal link between two observed variables X and Y from a dependence mediated by a low-complexity unobserved variable, such as a binary one. The method analyzes where the conditional distributions P(Y|x) lie in the simplex of all distributions of Y. It reports encouraging results on semi-empirical data, though no specific numbers are provided.
We describe a method that infers whether statistical dependences between two observed variables X and Y are due to a "direct" causal link or only due to a connecting causal path that contains an unobserved variable of low complexity, e.g., a binary variable. This problem is motivated by statistical genetics. Given a genetic marker that is correlated with a phenotype of interest, we want to detect whether this marker is causal or only correlates with a causal one. Our method is based on the analysis of the location of the conditional distributions P(Y|x) in the simplex of all distributions of Y. We report encouraging results on semi-empirical data.
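To make the geometric idea concrete: if the only path from X to Y runs through an unobserved binary variable Z, then P(Y|x) = P(z=0|x) P(Y|z=0) + P(z=1|x) P(Y|z=1), so every conditional distribution P(Y|x) lies on the line segment between P(Y|z=0) and P(Y|z=1) in the simplex. The sketch below is a noise-free illustration of that fact only, not the procedure used in this work. The function names, the example probabilities, and the choice of "distance from the line through the two most distant points" as a collinearity measure are all assumptions made for illustration. With finite samples, a statistical test would be needed in place of the exact geometric check.

```python
from math import sqrt


def conditional_distributions(joint):
    """Row-normalize joint[x][y] (counts or probabilities) into P(Y|x)."""
    conds = []
    for row in joint:
        total = sum(row)
        if total <= 0:
            raise ValueError("every value of X needs positive mass")
        conds.append([v / total for v in row])
    return conds


def max_distance_from_line(points):
    """Largest Euclidean distance of any point from the line through the
    two mutually most distant points (0.0 if all points coincide)."""
    def sub(p, q):
        return [a - b for a, b in zip(p, q)]

    def norm(v):
        return sqrt(sum(a * a for a in v))

    # Pick the two most distant points as the ends of the candidate segment.
    start, end = max(
        ((p, q) for p in points for q in points),
        key=lambda pair: norm(sub(pair[1], pair[0])),
    )
    length = norm(sub(end, start))
    if length == 0.0:
        return 0.0
    direction = [a / length for a in sub(end, start)]

    worst = 0.0
    for p in points:
        offset = sub(p, start)
        along = sum(a * b for a, b in zip(offset, direction))
        # Component of the offset that is orthogonal to the candidate line.
        residual = [a - along * d for a, d in zip(offset, direction)]
        worst = max(worst, norm(residual))
    return worst


# Hypothetical example: X -> Z -> Y with binary Z and three outcomes of Y.
p_y_given_z = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
p_z1_given_x = [0.1, 0.4, 0.6, 0.9]
mixed = [
    [(1 - w) * a + w * b for a, b in zip(*p_y_given_z)]
    for w in p_z1_given_x
]

# Hypothetical example: direct link, P(Y|x) chosen freely for each x.
direct = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.3, 0.3, 0.4]]

mixed_gap = max_distance_from_line(mixed)    # numerically close to zero
direct_gap = max_distance_from_line(direct)  # clearly positive
print(mixed_gap, direct_gap)
```

The check is only informative when X takes at least three values and Y takes at least three values: any two points are trivially collinear, and for binary Y the simplex is itself a line segment. Given a table of joint counts, `conditional_distributions` produces the rows that `max_distance_from_line` expects.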