Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation
This work addresses treatment effect estimation under data fusion, a problem in causal inference. The contribution is incremental: it builds on existing IV methods and adapts them in a novel way to the multi-source setting.
The paper tackles the problem of estimating treatment effects from fused datasets that combine multiple sources and contain unmeasured confounders. It reconstructs the source labels as a Group Instrumental Variable (GIV) and applies IV-based regression; the reported empirical results show advantages over state-of-the-art methods.
The advent of the big data era has brought new opportunities and challenges for estimating treatment effects in data fusion, that is, from a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Because source labels may be omitted and confounders may be unmeasured, traditional methods cannot estimate the individual treatment assignment probability or infer the treatment effect effectively. Therefore, we propose to reconstruct the source label and model it as a Group Instrumental Variable (GIV) to implement IV-based regression for treatment effect estimation. In this paper, we conceptualize this line of thought and develop a unified framework (Meta-EM) to (1) map the raw data into a representation space to construct Linear Mixed Models for the assigned treatment variable; (2) estimate the distribution differences and model the GIV for the different treatment assignment mechanisms; and (3) adopt an alternating training strategy to iteratively optimize the representations and the joint distribution to model the GIV for IV regression. Empirical results demonstrate the advantages of our Meta-EM compared with state-of-the-art methods.
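To illustrate why a source label can act as an instrument, the following minimal sketch runs plain two-stage least squares with group indicators on synthetic data. It is not the Meta-EM procedure: it assumes the source label has already been recovered, that each source shifts the treatment but affects the outcome only through the treatment, and that the effect is linear. The function name `two_stage_least_squares`, the data-generating process, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def two_stage_least_squares(z_onehot, t, y):
    """Estimate a linear treatment effect using group indicators as instruments."""
    # Stage 1: project the treatment onto the instrument (per-group means of t).
    stage1_coef, *_ = np.linalg.lstsq(z_onehot, t, rcond=None)
    t_hat = z_onehot @ stage1_coef
    # Stage 2: regress the outcome on the predicted treatment plus an intercept.
    design = np.column_stack([np.ones_like(t_hat), t_hat])
    stage2_coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return stage2_coef[1]


# Synthetic data (assumption): three sources, each with its own treatment shift.
rng = np.random.default_rng(0)
n, k = 20000, 3
group = rng.integers(0, k, size=n)      # source label, assumed already reconstructed
u = rng.normal(size=n)                  # unmeasured confounder
group_shift = np.array([-2.0, 0.0, 2.0])
t = group_shift[group] + u + rng.normal(size=n)
true_effect = 1.5
y = true_effect * t + 2.0 * u + rng.normal(size=n)

# IV estimate with one-hot group indicators as the instrument.
z = np.eye(k)[group]
iv_estimate = two_stage_least_squares(z, t, y)

# Naive OLS for comparison; it is biased because u drives both t and y.
ols_design = np.column_stack([np.ones(n), t])
ols_estimate = np.linalg.lstsq(ols_design, y, rcond=None)[0][1]

print(f"true effect: {true_effect}")
print(f"2SLS with group instrument: {iv_estimate:.3f}")
print(f"naive OLS: {ols_estimate:.3f}")
```

In this toy setting the group-based estimate should land near the true effect while ordinary least squares is pulled away by the confounder. The setting described in the abstract is harder, because the labels themselves must be inferred from the data before any such regression can be run.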