ML LG ME Mar 19

Multi-Domain Causal Empirical Bayes Under Linear Mixing

arXiv:2603.18404 91.21 citations h-index: 10
AI Analysis

This work addresses the estimation problem in causal representation learning and is aimed at machine learning researchers. It is incremental in that it builds on known identifiability results, though the proposed method itself is new.

The paper tackles the problem of estimating causal latent variables from multi-domain data by proposing an empirical Bayes algorithm that leverages invariant structure, achieving more accurate estimation than existing methods on synthetic data.

Causal representation learning (CRL) aims to learn low-dimensional causal latent variables from high-dimensional observations. While identifiability has been extensively studied for CRL, estimation has been less explored. In this paper, we explore the use of empirical Bayes (EB) to estimate causal representations. In particular, we consider the problem of learning from data from multiple domains, where differences between domains are modeled by interventions in a shared underlying causal model. Multi-domain CRL naturally poses a simultaneous inference problem that EB is designed to tackle. Here, we propose an EB $f$-modeling algorithm that improves the quality of learned causal variables by exploiting invariant structure within and across domains. Specifically, we consider a linear measurement model and interventional priors arising from a shared acyclic SCM. When the graph and intervention targets are known, we develop an EM-style algorithm based on causally structured score matching. We further discuss EB $g$-modeling in the context of existing CRL approaches. In experiments on synthetic data, our proposed method achieves more accurate estimation than other methods for CRL.
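
As a rough illustration of the general EB $f$-modeling idea mentioned in the abstract (not the paper's causally structured algorithm), the sketch below applies Tweedie's formula, $\mathbb{E}[z \mid x] = x + \sigma^2 \nabla_x \log f(x)$, in a one-dimensional toy problem. Everything here is an assumption made for illustration: the function name `tweedie_denoise`, the scalar Gaussian latent, the known noise variance `sigma2`, and the Gaussian fit to the marginal, which stands in for a learned score.

```python
import numpy as np

def tweedie_denoise(x, sigma2):
    """Posterior-mean estimate of the latents via Tweedie's formula.

    Assumption: the marginal density f of x is approximated by a Gaussian
    fitted to the observations, so its score is -(x - mean) / variance.
    """
    m = x.mean()
    s2 = x.var()
    score = -(x - m) / s2          # d/dx log f(x) under the Gaussian fit
    return x + sigma2 * score      # Tweedie: E[z | x] = x + sigma^2 * score

# Toy data (hypothetical): latent z, observed x = z + Gaussian noise.
rng = np.random.default_rng(0)
n, sigma2 = 5000, 1.0
z = rng.normal(2.0, 0.5, size=n)
x = z + rng.normal(0.0, np.sqrt(sigma2), size=n)

z_hat = tweedie_denoise(x, sigma2)

mse_raw = np.mean((x - z) ** 2)      # error of using x directly
mse_eb = np.mean((z_hat - z) ** 2)   # error after EB shrinkage
print(f"raw MSE: {mse_raw:.3f}, EB MSE: {mse_eb:.3f}")
```

The estimate depends only on the marginal of the observations and never on an explicit prior, which is what distinguishes $f$-modeling from $g$-modeling. The setting described in the abstract replaces this scalar toy with a linear measurement model and a score structured by the causal graph.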
