Linear Causal Disentanglement via Interventions
This work addresses the identifiability challenge in causal representation learning. The contribution is incremental in that it builds on existing linear models with interventions.
The paper tackles the problem of causal disentanglement in linear latent models, showing that interventions on each latent variable are necessary and sufficient for identifiability, and provides a method that accurately recovers the latent causal model.
Causal disentanglement seeks a representation of data involving latent variables that relate to one another via a causal model. A representation is identifiable if both the latent model and the transformation from latent to observed variables are unique. In this paper, we study observed variables that are a linear transformation of a linear latent causal model. Data from interventions are necessary for identifiability: if one latent variable is missing an intervention, we show that there exist distinct models that cannot be distinguished. Conversely, we show that a single intervention on each latent variable is sufficient for identifiability. Our proof uses a generalization of the RQ decomposition of a matrix that replaces the usual orthogonal and upper triangular conditions with analogues depending on a partial order on the rows of the matrix, with the partial order determined by a latent causal model. We corroborate our theoretical results with a method for causal disentanglement that accurately recovers a latent causal model.
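To make the setting concrete, the following is a minimal numerical sketch of why observational data alone can leave the model ambiguous. It is not the paper's method or proof. It assumes a latent model of the form Z = A Z + eps with independent noise, observations X = G Z, and a perfect intervention that removes a target's incoming edges and resets its noise variance. The dimensions, edge weights, and the names `observed_cov` and `perfect_intervention` are hypothetical, and the comparison is restricted to second moments.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 3, 5  # latent and observed dimensions (illustrative choices)

# Hypothetical latent chain 0 -> 1 -> 2; A[i, j] is the weight of edge j -> i.
A = np.zeros((d, d))
A[1, 0] = 0.8
A[2, 1] = -0.5
omega = np.array([1.0, 0.5, 2.0])  # latent noise variances (assumed)
G = rng.normal(size=(p, d))        # linear map from latent to observed


def observed_cov(A, omega, G):
    """Covariance of X = G Z, where Z = A Z + eps and Var(eps) = diag(omega)."""
    b_inv = np.linalg.inv(np.eye(A.shape[0]) - A)
    cov_z = b_inv @ np.diag(omega) @ b_inv.T
    return G @ cov_z @ G.T


def perfect_intervention(A, omega, target, new_var):
    """Assumed intervention: cut edges into `target` and set its noise variance."""
    A_int, omega_int = A.copy(), omega.copy()
    A_int[target, :] = 0.0
    omega_int[target] = new_var
    return A_int, omega_int


cov_obs = observed_cov(A, omega, G)

# A different model with the same observational covariance: an empty latent
# graph with unit noise, and a mixing matrix that absorbs the causal structure
# plus an arbitrary rotation Q.
M = G @ np.linalg.inv(np.eye(d) - A) @ np.diag(np.sqrt(omega))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
G_alt = M @ Q
cov_alt = observed_cov(np.zeros((d, d)), np.ones(d), G_alt)
same_obs = np.allclose(cov_obs, cov_alt)

# Intervening on latent variable 1 changes the observed covariance, which is
# the kind of extra information the interventional contexts supply.
A_int, omega_int = perfect_intervention(A, omega, target=1, new_var=3.0)
cov_int = observed_cov(A_int, omega_int, G)
changed = not np.allclose(cov_obs, cov_int)

print("alternative model matches observational covariance:", same_obs)
print("intervention changes observed covariance:", changed)
```

The rotation Q is arbitrary, so the observational covariance is compatible with a whole family of mixing matrices. The sketch only shows that an intervention changes what is observed; that one intervention per latent variable removes the ambiguity is the paper's result, which the code does not verify.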