AI, LG · Feb 26

Causal Identification from Counterfactual Data: Completeness and Bounding Results

arXiv:2602.23541v22.41 citations · h-index: 9

Originality: Highly original

AI Analysis

This work provides a foundational understanding of the limits of causal inference for researchers and practitioners working with counterfactual data, particularly in non-parametric settings. It is a significant theoretical advancement in causal identification.

This paper addresses the identification of counterfactual quantities when some Layer 3 counterfactual data is directly estimable. The authors develop the CTFIDU+ algorithm, proving its completeness for identifying counterfactual queries from arbitrary Layer 3 distributions, and establish the theoretical limits of identification from physically realizable distributions. For non-identifiable counterfactuals, they derive novel analytic bounds, showing that counterfactual data tightens these bounds in simulations.

Previous work establishing completeness results for counterfactual identification has been circumscribed to the setting where the input data belongs to observational or interventional distributions (Layers 1 and 2 of Pearl's Causal Hierarchy), since it was generally presumed impossible to obtain data from counterfactual distributions, which belong to Layer 3. However, recent work (Raghavan & Bareinboim, 2025) has formally characterized a family of counterfactual distributions which can be directly estimated via experimental methods, a notion they call counterfactual realizability. This leaves open the question of what additional counterfactual quantities now become identifiable, given this new access to (some) Layer 3 data. To answer this question, we develop the CTFIDU+ algorithm for identifying counterfactual queries from an arbitrary set of Layer 3 distributions, and prove that it is complete for this task. Building on this, we establish the theoretical limit of which counterfactuals can be identified from physically realizable distributions, thus implying the fundamental limit to exact causal inference in the non-parametric setting. Finally, given the impossibility of identifying certain critical types of counterfactuals, we derive novel analytic bounds for such quantities using realizable counterfactual data, and corroborate using simulations that counterfactual data helps tighten the bounds for non-identifiable quantities in practice.
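
To make the idea of an analytic bound on a non-identifiable counterfactual concrete, the sketch below computes the classical bounds on the probability of necessity and sufficiency, PNS = P(Y_x = 1, Y_x' = 0), for a binary treatment and outcome using only interventional (Layer 2) data. This is the well-known Tian and Pearl bound, not the new bound derived in the paper, and it does not use realizable counterfactual data. The function name `pns_bounds` and its parameter names are hypothetical, chosen for illustration only.

```python
def pns_bounds(p_y_do_x: float, p_y_do_not_x: float) -> tuple[float, float]:
    """Bounds on PNS = P(Y_x = 1, Y_x' = 0) from interventional data alone.

    p_y_do_x     : P(Y = 1 | do(X = x)),  the outcome rate under treatment
    p_y_do_not_x : P(Y = 1 | do(X = x')), the outcome rate under control

    These are the Frechet bounds on the joint of two events whose marginals
    are known but whose joint distribution is not (illustrative sketch only).
    """
    for p in (p_y_do_x, p_y_do_not_x):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")

    # Lower bound: the joint cannot be smaller than the excess of the marginals.
    lower = max(0.0, p_y_do_x - p_y_do_not_x)
    # Upper bound: the joint cannot exceed either marginal, P(Y_x = 1) or P(Y_x' = 0).
    upper = min(p_y_do_x, 1.0 - p_y_do_not_x)
    return lower, upper


if __name__ == "__main__":
    lo, hi = pns_bounds(0.75, 0.25)
    print(f"PNS lies in [{lo}, {hi}]")
```

The interval is generally not a single point, which is what non-identifiability means here: the interventional data are consistent with a range of PNS values. The abstract's claim is that adding realizable counterfactual data can narrow intervals of this kind.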
