Multi-Domain Causal Representation Learning via Weak Distributional Invariances
This work addresses a key limitation in causal representation learning for multi-domain data and offers a more widely applicable approach, though the contribution is incremental in nature.
The paper tackles the problem of learning causal representations from multi-domain datasets by relaxing the assumption that each domain arises from a single-node perfect intervention. It shows that autoencoders incorporating weak distributional invariances can provably identify the stable latents across settings.
Causal representation learning has emerged as the center of action in causal machine learning research. In particular, multi-domain datasets present a natural opportunity for showcasing the advantages of causal representation learning over standard unsupervised representation learning. While recent works have taken crucial steps towards learning causal representations, they often lack applicability to multi-domain datasets due to over-simplifying assumptions about the data, e.g., that each domain comes from a different single-node perfect intervention. In this work, we relax these assumptions and capitalize on the following observation: there often exists a subset of latents for which certain distributional properties (e.g., support, variance) remain stable across domains; this property holds when, for example, each domain comes from a multi-node imperfect intervention. Leveraging this observation, we show that autoencoders that incorporate such invariances can provably identify the stable set of latents from the rest across different settings.
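To make the invariance idea above concrete, the following is a minimal sketch of how a mismatch in variance or support across domains could be measured on a candidate subset of latent dimensions. It is an illustration only, not the paper's actual objective: the function names, the `stable_dims` argument, and the squared-deviation form of the penalties are assumptions introduced here, and the toy latents stand in for the output of an autoencoder's encoder.

```python
import numpy as np

def variance_invariance_penalty(latents_by_domain, stable_dims):
    """Squared deviation of per-domain variances from their cross-domain mean.

    latents_by_domain: list of arrays, each of shape (n_samples, latent_dim).
    stable_dims: indices of the latents hypothesized to be stable (an assumption).
    """
    variances = np.stack(
        [z[:, stable_dims].var(axis=0) for z in latents_by_domain]
    )  # shape: (n_domains, n_stable_dims)
    return float(((variances - variances.mean(axis=0)) ** 2).sum())

def support_invariance_penalty(latents_by_domain, stable_dims):
    """Squared deviation of per-domain empirical min/max from their cross-domain mean."""
    mins = np.stack([z[:, stable_dims].min(axis=0) for z in latents_by_domain])
    maxs = np.stack([z[:, stable_dims].max(axis=0) for z in latents_by_domain])
    return float(
        ((mins - mins.mean(axis=0)) ** 2).sum()
        + ((maxs - maxs.mean(axis=0)) ** 2).sum()
    )

# Toy latents for two domains: dimension 0 has the same variance and support
# in both domains, while dimension 1 is rescaled in the second domain.
z_domain_a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
z_domain_b = np.array([[0.0, 0.0], [1.0, 3.0], [2.0, 6.0]])
domains = [z_domain_a, z_domain_b]

print("variance penalty, dim 0:", variance_invariance_penalty(domains, [0]))
print("variance penalty, dim 1:", variance_invariance_penalty(domains, [1]))
print("support penalty,  dim 0:", support_invariance_penalty(domains, [0]))
print("support penalty,  dim 1:", support_invariance_penalty(domains, [1]))
```

In a training setup, a penalty of this kind would typically be added to the autoencoder's reconstruction loss and evaluated on the encoder's outputs for each domain; how the paper actually enforces the constraint, and under which conditions identification is guaranteed, is not specified in the abstract.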