Beyond identifiability: Learning causal representations with few environments and finite samples
This work addresses estimation and finite-sample bounds in causal representation learning. The contribution is incremental: it provides explicit guarantees for a previously identified bottleneck.
The paper tackles the problem of learning causal representations with finite-sample guarantees from a sublinear number of environments. It shows that consistent recovery of the latent causal graph, the mixing matrix, and the unknown intervention targets can be achieved with only a logarithmic number of interventions.
We provide explicit, finite-sample guarantees for learning causal representations from data with a sublinear number of environments. Causal representation learning seeks to provide a rigorous foundation for the general representation learning problem by bridging causal models with latent factor models in order to learn interpretable representations with causal semantics. Despite a blossoming theory of identifiability in causal representation learning, estimation and finite-sample bounds are less well understood. We show that causal representations can be learned with only a logarithmic number of unknown, multi-node interventions, and that the intervention targets need not be carefully designed in advance. Through a careful perturbation analysis, we guarantee consistent recovery of (a) the latent causal graph, (b) the mixing matrix and representations, and (c) the unknown intervention targets.