On the Effects of Irrelevant Variables in Treatment Effect Estimation with Deep Disentanglement
This work addresses a specific bottleneck in causal inference for domains like healthcare and economics, though it is an incremental improvement over existing deep disentanglement approaches.
The paper tackles the problem of irrelevant variables degrading treatment effect estimation in observational data by explicitly identifying and representing them in a separate embedding space, resulting in better prediction accuracy and robustness compared to previous methods.
Estimating treatment effects from observational data is paramount in healthcare, education, and economics, but current deep disentanglement-based methods for addressing selection bias handle irrelevant variables insufficiently. We demonstrate in experiments that this leads to prediction errors. We disentangle pre-treatment variables with a deep embedding method and explicitly identify and represent irrelevant variables, in addition to the instrumental, confounding, and adjustment latent factors. To this end, we introduce a reconstruction objective and create an embedding space for irrelevant variables using an attached autoencoder. Instead of relying on serendipitous suppression of irrelevant variables as in previous deep disentanglement approaches, we explicitly force irrelevant variables into this embedding space and employ orthogonalization to prevent irrelevant information from leaking into the latent space representations of the other factors. Our experiments with synthetic and real-world benchmark datasets show that we can better identify irrelevant variables and more precisely predict treatment effects than previous methods, while prediction quality degrades less when additional irrelevant variables are introduced.
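To make the combination of a reconstruction objective and orthogonalization more concrete, the following is a minimal sketch rather than the paper's actual loss. It assumes that the autoencoder reconstructs the input covariates, that orthogonalization is expressed as a squared Frobenius norm penalty on the cross-product between each factor embedding and the irrelevant-variable embedding, and that the two terms are combined with a single weight. All names (`orthogonality_penalty`, `reconstruction_mse`, `irrelevance_loss`, `ortho_weight`) are hypothetical, and plain Python lists stand in for the tensors a deep learning framework would use.

```python
from typing import List, Sequence

# Rows are samples in a batch, columns are embedding (or covariate) dimensions.
Matrix = List[List[float]]


def cross_gram(a: Matrix, b: Matrix) -> Matrix:
    """Return a^T b: entry (i, j) is the dot product, taken over the batch,
    of column i of `a` with column j of `b`."""
    n = len(a)
    if n == 0 or len(b) != n:
        raise ValueError("both embeddings must cover the same, non-empty batch")
    dim_a, dim_b = len(a[0]), len(b[0])
    return [
        [sum(a[s][i] * b[s][j] for s in range(n)) for j in range(dim_b)]
        for i in range(dim_a)
    ]


def orthogonality_penalty(a: Matrix, b: Matrix) -> float:
    """Squared Frobenius norm of a^T b. It is zero exactly when every
    column of `a` is orthogonal to every column of `b` over the batch."""
    return sum(v * v for row in cross_gram(a, b) for v in row)


def reconstruction_mse(x: Matrix, x_hat: Matrix) -> float:
    """Mean squared error between inputs and their reconstruction."""
    n, d = len(x), len(x[0])
    total = sum((x[s][k] - x_hat[s][k]) ** 2 for s in range(n) for k in range(d))
    return total / (n * d)


def irrelevance_loss(
    factor_embeddings: Sequence[Matrix],  # e.g. instrumental, confounding, adjustment
    irrelevant_embedding: Matrix,         # output of the attached encoder
    x: Matrix,                            # assumed reconstruction target
    x_hat: Matrix,                        # decoder output
    ortho_weight: float = 1.0,            # hypothetical trade-off weight
) -> float:
    """Reconstruction error plus a weighted penalty that discourages overlap
    between the irrelevant embedding and each of the other factor embeddings."""
    ortho = sum(
        orthogonality_penalty(z, irrelevant_embedding) for z in factor_embeddings
    )
    return reconstruction_mse(x, x_hat) + ortho_weight * ortho
```

Under these assumptions, the penalty vanishes when the irrelevant embedding shares no direction with the other factors and grows as information overlaps, which is the kind of leakage the orthogonalization is meant to discourage. In a real model the same quantities would be computed on differentiable tensors and added to the outcome and treatment prediction losses.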