Empirical Bayesian Approaches for Robust Constraint-based Causal Discovery under Insufficient Data
This work addresses a practical limitation for researchers and practitioners who apply causal discovery in data-scarce real-world applications. It is an incremental but important enhancement.
The paper tackles the failure of constraint-based causal discovery under insufficient data by proposing Bayesian-augmented frequentist independence tests, which yield significant improvements in accuracy and efficiency over state-of-the-art methods on benchmark datasets.
Causal discovery aims to learn cause-effect relationships among variables from observational data and is important for many applications. Existing causal discovery methods assume that sufficient data are available, which may not hold for many real-world datasets. As a result, many existing methods can fail when data are limited. In this work, we propose Bayesian-augmented frequentist independence tests to improve the performance of constraint-based causal discovery methods under insufficient data: (1) we first introduce a Bayesian method to estimate mutual information (MI), based on which we propose a robust MI-based independence test; (2) we then consider the Bayesian estimation of the hypothesis likelihood and incorporate it into a well-defined statistical test, resulting in a robust independence test based on statistical testing. We apply the proposed independence tests to constraint-based causal discovery methods and evaluate their performance on benchmark datasets with insufficient samples. Experiments show significant improvement in both accuracy and efficiency over state-of-the-art (SOTA) methods.
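To make the idea of a Bayesian MI-based independence test concrete, the following is a minimal sketch and not the paper's estimator. It assumes two discrete variables summarized as a contingency table, a symmetric Dirichlet prior with pseudo-count `alpha`, and a fixed decision `threshold`. The function names `bayes_mi` and `is_independent`, the prior, and the threshold value are all illustrative assumptions that do not come from the abstract.

```python
import math


def bayes_mi(counts, alpha=1.0):
    """Mutual information (in nats) of the posterior-mean joint distribution.

    Assumption: a symmetric Dirichlet(alpha) prior over the joint cells,
    so each cell probability is (count + alpha) / (N + alpha * K).
    """
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts) + alpha * n_rows * n_cols
    joint = [[(c + alpha) / total for c in row] for row in counts]

    # Marginals of the smoothed joint distribution.
    px = [sum(row) for row in joint]
    py = [sum(joint[i][j] for i in range(n_rows)) for j in range(n_cols)]

    mi = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            p = joint[i][j]
            if p > 0.0:  # cells with zero probability contribute nothing
                mi += p * math.log(p / (px[i] * py[j]))
    return mi


def is_independent(counts, alpha=1.0, threshold=0.05):
    """Declare independence when the smoothed MI is below a threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    return bayes_mi(counts, alpha) < threshold


if __name__ == "__main__":
    balanced = [[10, 10], [10, 10]]   # no association in the counts
    diagonal = [[20, 0], [0, 20]]     # perfect association in the counts
    print(bayes_mi(balanced), is_independent(balanced))
    print(bayes_mi(diagonal), is_independent(diagonal))
```

With `alpha=0` this reduces to the ordinary plug-in MI estimate. A positive `alpha` pulls the estimated joint distribution toward uniform, which damps the spuriously large MI values that small samples tend to produce. In a constraint-based method such as PC, a test of this kind would stand in for the conditional independence oracle, applied to the contingency table within each conditioning stratum.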