LGDec 28, 2025
Adapting, Fast and Slow: Transportable Circuits for Few-Shot LearningKasra Jalaldoust, Elias Bareinboim
Generalization across the domains is not possible without asserting a structure that constrains the unseen target domain w.r.t. the source domain. Building on causal transportability theory, we design an algorithm for zero-shot compositional generalization which relies on access to qualitative domain knowledge in form of a causal graph for intra-domain structure and discrepancies oracle for inter-domain mechanism sharing. \textit{Circuit-TR} learns a collection of modules (i.e., local predictors) from the source data, and transport/compose them to obtain a circuit for prediction in the target domain if the causal structure licenses. Furthermore, circuit transportability enables us to design a supervised domain adaptation scheme that operates without access to an explicit causal structure, and instead uses limited target data. Our theoretical results characterize classes of few-shot learnable tasks in terms of graphical circuit transportability criteria, and connects few-shot generalizability with the established notion of circuit size complexity; controlled simulations corroborate our theoretical results.
LGMay 24
Abduction-Deduction Entanglement: Domain Generalization via Representation TransplantsKasra Jalaldoust, Elias Bareinboum
Prediction models trained under the source distribution do not generalize well to a different target distribution. A valid inference about an unseen data distribution must be anchored by the invariance of certain causal mechanisms that generate the source and target data, however, these structural invariances are non-identifiable from the source data alone. Under mild causal assumptions about the data, we show that the optimal prediction in the target is in fact partially identifiable by the source distribution. The result rests on a simple observation: In any domain, the optimal prediction can be factorized into what we call a pair of abduction and deduction maps, where the abduction map makes inference about some unobserved variables (possibly confounders) from the observed variables and the deduction map predicts the label using both the observed and inferred quantities. Access to large source data pins down the optimal prediction, thus constrains the valid abduction-deduction ensembles that produce it -- a non-identifiability that we call the abduction-deduction entanglement. To leverage this, we parameterize the constrained family using what we call a representation transplant, that is a specific linear transformation in the representation space that manipulates the abduction content of the representation while retaining the deduction component. Invariance of the causal mechanism generating the label implies existence of an invariant deduction map between source and target. Thus, we can search the space of plausible target distributions via a parametric transplant. We use this scheme in a learner-adversary game that, under an idealistic optimization, provably terminates with the learner having the minimax-optimal target prediction. Evaluations verify the theory, showing that the method is competitive in DG benchmarks.
LGMar 30, 2025
Partial Transportability for Domain GeneralizationKasra Jalaldoust, Alexis Bellot, Elias Bareinboim
A fundamental task in AI is providing performance guarantees for predictions made in unseen domains. In practice, there can be substantial uncertainty about the distribution of new data, and corresponding variability in the performance of existing predictors. Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution, such as the generalization error of a classifier, given data from source domains and assumptions about the data generating mechanisms, encoded in causal diagrams. Our contribution is to provide the first general estimation technique for transportability problems, adapting existing parameterization schemes such Neural Causal Models to encode the structural constraints necessary for cross-population inference. We demonstrate the expressiveness and consistency of this procedure and further propose a gradient-based optimization scheme for making scalable inferences in practice. Our results are corroborated with experiments.
LGApr 30, 2025
Multi-Domain Causal Discovery in Bijective Causal ModelsKasra Jalaldoust, Saber Salehkaleybar, Negar Kiyavash
We consider the problem of causal discovery (a.k.a., causal structure learning) in a multi-domain setting. We assume that the causal functions are invariant across the domains, while the distribution of the exogenous noise may vary. Under causal sufficiency (i.e., no confounders exist), we show that the causal diagram can be discovered under less restrictive functional assumptions compared to previous work. What enables causal discovery in this setting is bijective generation mechanisms (BGM), which ensures that the functional relation between the exogenous noise $E$ and the endogenous variable $Y$ is bijective and differentiable in both directions at every level of the cause variable $X = x$. BGM generalizes a variety of models including additive noise model, LiNGAM, post-nonlinear model, and location-scale noise model. Further, we derive a statistical test to find the parents set of the target variable. Experiments on various synthetic and real-world datasets validate our theoretical findings.