MLLGOct 24, 2019

Integrating overlapping datasets using bivariate causal discovery

arXiv:1910.11356v223 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of causal discovery when variables are not all measured in a single dataset, which is a common issue in scientific domains, but it is incremental as it builds on existing bivariate methods.

The paper tackled the problem of learning causal structures from multiple overlapping datasets, which previous methods could only handle by returning many possible structures due to reliance on conditional independence tests. It adapted bivariate causal discovery algorithms to this setting, providing a sound and complete algorithm that outperformed previous approaches on synthetic and real data.

Causal knowledge is vital for effective reasoning in science, as causal relations, unlike correlations, allow one to reason about the outcomes of interventions. Algorithms that can discover causal relations from observational data are based on the assumption that all variables have been jointly measured in a single dataset. In many cases this assumption fails. Previous approaches to overcoming this shortcoming devised algorithms that returned all joint causal structures consistent with the conditional independence information contained in each individual dataset. But, as conditional independence tests only determine causal structure up to Markov equivalence, the number of consistent joint structures returned by these approaches can be quite large. The last decade has seen the development of elegant algorithms for discovering causal relations beyond conditional independence, which can distinguish among Markov equivalent structures. In this work we adapt and extend these so-called bivariate causal discovery algorithms to the problem of learning consistent causal structures from multiple datasets with overlapping variables belonging to the same generating process, providing a sound and complete algorithm that outperforms previous approaches on synthetic and real data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes