ML LGJan 31, 2020

Causal Structure Discovery from Distributions Arising from Mixtures of DAGs

Basil Saeed, Snigdha Panigrahi, Caroline Uhler

arXiv:2001.11940v214.732 citations

Originality Incremental advance

AI Analysis

This addresses causal discovery in complex, heterogeneous data for researchers in statistics and machine learning, but it is incremental as it extends existing latent variable methods to mixture models.

The paper tackles the problem of learning causal structures from distributions that are mixtures of directed acyclic graphs (DAGs), proving that a graphical representation encodes conditional independence relations and showing that algorithms like FCI recover a union of component DAGs and identify variables with varying distributions across components.

We consider distributions arising from a mixture of causal models, where each model is represented by a directed acyclic graph (DAG). We provide a graphical representation of such mixture distributions and prove that this representation encodes the conditional independence relations of the mixture distribution. We then consider the problem of structure learning based on samples from such distributions. Since the mixing variable is latent, we consider causal structure discovery algorithms such as FCI that can deal with latent variables. We show that such algorithms recover a "union" of the component DAGs and can identify variables whose conditional distribution across the component DAGs vary. We demonstrate our results on synthetic and real data showing that the inferred graph identifies nodes that vary between the different mixture components. As an immediate application, we demonstrate how retrieval of this causal information can be used to cluster samples according to each mixture component.

View on arXiv PDF

Similar