Identifiability Guarantees for Causal Disentanglement from Soft Interventions
This work addresses the challenge of identifying causal structure when the causal variables themselves are not observed. It is an incremental contribution in that it extends existing identifiability results from fully observed to unobserved causal variables.
The paper tackles causal disentanglement with unobserved latent variables using unpaired observational and interventional data. It shows that identifiability can be achieved under a generalized faithfulness assumption, so that, in the infinite-data limit, the latent causal model can be recovered up to an equivalence class and the effects of unseen combinations of interventions can be predicted.
Causal disentanglement aims to uncover a representation of data using latent variables that are interrelated through a causal model. Such a representation is identifiable if the latent model that explains the data is unique. In this paper, we focus on the scenario where unpaired observational and interventional data are available, with each intervention changing the mechanism of a latent variable. When the causal variables are fully observed, statistically consistent algorithms have been developed to identify the causal model under faithfulness assumptions. Here we show that identifiability can still be achieved with unobserved causal variables, given a generalized notion of faithfulness. Our results guarantee that, in the limit of infinite data, we can recover the latent causal model up to an equivalence class and predict the effect of unseen combinations of interventions. We implement our causal disentanglement framework by developing an autoencoding variational Bayes algorithm and apply it to the problem of predicting combinatorial perturbation effects in genomics.
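To make the data setting concrete, the following is a minimal sketch of unpaired observational and interventional samples generated from a toy latent causal model. It is an illustration under stated assumptions, not the paper's model or algorithm: the three-variable linear chain Z1 -> Z2 -> Z3, the Gaussian noise, the tanh mixing, and the names sample_latents, mix, and slope are all hypothetical. A soft intervention here changes the mechanism of its target (a new weight and noise scale) while keeping the dependence on the parent, and only the mixed observations would be available to a learner.

```python
import numpy as np

def sample_latents(n, rng, target=None, weight=0.2, noise_scale=2.0):
    """Sample latents from a toy linear chain Z1 -> Z2 -> Z3 (an assumed model).

    A soft intervention on `target` (2 or 3) replaces that variable's
    mechanism with a new weight and noise scale; the edge from its parent
    is kept, unlike a hard (do-style) intervention.
    """
    z1 = rng.normal(size=n)
    w2, s2 = (weight, noise_scale) if target == 2 else (1.0, 1.0)
    z2 = w2 * z1 + s2 * rng.normal(size=n)
    w3, s3 = (weight, noise_scale) if target == 3 else (1.0, 1.0)
    z3 = w3 * z2 + s3 * rng.normal(size=n)
    return np.column_stack([z1, z2, z3])

def mix(z, A):
    """Hypothetical mixing function mapping latents to observations."""
    return np.tanh(z @ A.T)

def slope(z, child, parent):
    """Least-squares slope of column `child` regressed on column `parent`."""
    return np.cov(z[:, parent], z[:, child])[0, 1] / np.var(z[:, parent], ddof=1)

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))  # assumed 5 observed dimensions, 3 latents

# Unpaired data: the two datasets are drawn independently, so no
# interventional sample corresponds to a particular observational sample.
z_obs = sample_latents(20000, rng)
z_int = sample_latents(20000, rng, target=2)
x_obs = mix(z_obs, A)  # what a learner would observe without intervention
x_int = mix(z_int, A)  # what a learner would observe under the intervention

# Only the mechanism of Z2 changes; the mechanism of Z3 given Z2 is untouched.
print("Z2 on Z1 slope:", slope(z_obs, 1, 0), "vs", slope(z_int, 1, 0))
print("Z3 on Z2 slope:", slope(z_obs, 2, 1), "vs", slope(z_int, 2, 1))
```

In this toy setup the latents z_obs and z_int are inspected directly only to show what the intervention changes. In the disentanglement problem itself they are unobserved, and the question is whether the latent model can be recovered from x_obs and x_int alone.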