LG | Jul 16, 2025

Targeted Deep Architectures: A TMLE-Based Framework for Robust Causal Inference in Neural Networks

Yi Li, David McCoy, Nolan Gunter, Kaitlyn Lee, Alejandro Schuler, Mark van der Laan

arXiv:2507.12435v14.11 citations | h-index: 45

Originality: Highly original

AI Analysis

This work addresses the need for robust causal inference in deep learning, particularly for complex multi-parameter targets like survival curves, offering a scalable solution for researchers and practitioners in fields such as healthcare and social sciences.

The paper addresses a gap in deep neural networks: they often lack valid inference for causal parameters such as treatment effects or survival curves. It proposes Targeted Deep Architectures (TDA), which embed Targeted Maximum Likelihood Estimation (TMLE) directly into the network. On the IHDP benchmark and on simulated survival data, TDA reduces bias and improves confidence-interval coverage, and the resulting intervals are asymptotically valid.

Modern deep neural networks are powerful predictive tools yet often lack valid inference for causal parameters, such as treatment effects or entire survival curves. While frameworks like Double Machine Learning (DML) and Targeted Maximum Likelihood Estimation (TMLE) can debias machine-learning fits, existing neural implementations either rely on "targeted losses" that do not guarantee solving the efficient influence function equation or require computationally expensive post-hoc "fluctuations" in multi-parameter settings. We propose Targeted Deep Architectures (TDA), a new framework that embeds TMLE directly into the network's parameter space with no restrictions on the backbone architecture. Specifically, TDA partitions the model parameters, freezing all but a small "targeting" subset, and iteratively updates that subset along a targeting gradient obtained by projecting the influence functions onto the span of the gradients of the loss with respect to the weights. This procedure yields plug-in estimates that remove first-order bias and produce asymptotically valid confidence intervals. Crucially, TDA extends to multi-dimensional causal estimands (e.g., entire survival curves) by merging separate targeting gradients into a single universal targeting update. Theoretically, TDA inherits classical TMLE properties, including double robustness and semiparametric efficiency. Empirically, on the benchmark IHDP dataset (average treatment effects) and on simulated survival data with informative censoring, TDA reduces bias and improves coverage relative to both standard neural-network estimators and prior post-hoc approaches. In doing so, TDA establishes a direct, scalable pathway toward rigorous causal inference within modern deep architectures for complex multi-parameter targets.
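The targeting step described in the abstract can be illustrated with a small NumPy sketch for the average treatment effect. This is a toy reading of the idea, not the authors' implementation. The simulated data, the `features` and `run_demo` names, the choice of last-layer weights as the targeting subset, the stratified propensity estimate, and the one-dimensional Newton step are all assumptions made for illustration. The toy is also built so that the clever covariate lies exactly in the span of the frozen features, which makes the projection exact; in general it would only be approximate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def features(a, x1, x2):
    # Hypothetical frozen "backbone" output feeding the last layer.
    return np.column_stack([np.ones_like(x2), a, x1, a * x1, x2, a * x2])

def run_demo(n=2000, seed=0):
    rng = np.random.default_rng(seed)
    # Simulated confounded data (assumption): x1 drives both A and Y.
    x1 = rng.integers(0, 2, n).astype(float)
    x2 = rng.normal(size=n)
    a = rng.binomial(1, 0.3 + 0.4 * x1).astype(float)
    y = rng.binomial(1, sigmoid(-0.5 + a + 0.8 * x1 + 0.5 * x2)).astype(float)

    phi = features(a, x1, x2)

    # Deliberately under-trained initial fit: a few gradient-ascent steps
    # on the log-likelihood, standing in for an early-stopped network.
    w = np.zeros(phi.shape[1])
    for _ in range(5):
        w += 0.5 * phi.T @ (y - sigmoid(phi @ w)) / n

    # Propensity estimate by stratum means, then the clever covariate H.
    g = np.where(x1 == 1, a[x1 == 1].mean(), a[x1 == 0].mean())
    h = a / g - (1 - a) / (1 - g)

    def eif_mean(weights):
        # Empirical mean of the outcome component of the ATE influence function.
        return np.mean(h * (y - sigmoid(phi @ weights)))

    before = eif_mean(w)

    # Targeting loop: only the last-layer weights w (the targeting subset) move.
    for _ in range(50):
        q = sigmoid(phi @ w)
        r = y - q
        d = h * r                      # per-sample influence-function term
        if abs(d.mean()) < 1e-10:
            break
        scores = r[:, None] * phi      # per-sample log-likelihood gradients w.r.t. w
        # Project the influence function onto the span of the score components.
        delta, *_ = np.linalg.lstsq(scores, d, rcond=None)
        lin = phi @ delta
        # Newton step for the 1-D likelihood maximisation along delta.
        step = np.mean(r * lin) / np.mean(q * (1 - q) * lin ** 2)
        w = w + step * delta

    after = eif_mean(w)

    # Plug-in ATE and an influence-function-based 95% interval.
    q1 = sigmoid(features(np.ones(n), x1, x2) @ w)
    q0 = sigmoid(features(np.zeros(n), x1, x2) @ w)
    psi = np.mean(q1 - q0)
    ic = h * (y - sigmoid(phi @ w)) + q1 - q0 - psi
    se = ic.std(ddof=1) / np.sqrt(n)
    return {"before": before, "after": after, "ate": psi,
            "ci": (psi - 1.96 * se, psi + 1.96 * se)}

result = run_demo()
print(result)
```

In this sketch, `before` and `after` report the empirical mean of the influence-function term before and after targeting; the update is meant to drive that mean toward zero, which is the sense in which the plug-in estimate removes first-order bias. A multi-parameter target such as a survival curve would need one influence function per time point, and the abstract's "universal targeting update" merges those directions; that step is not shown here.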
