General targeted machine learning for modern causal mediation analysis
The paper provides a general statistical tool for researchers in fields such as epidemiology and the social sciences to estimate mediation effects non-parametrically, addressing a bottleneck in causal inference. The contribution is incremental in that it combines existing definitions of mediation with new estimation techniques.
The paper tackles the lack of non-parametric estimation methods for causal mediation analysis with multiple, continuous, or high-dimensional mediators. It proposes a one-step estimation algorithm that unifies six popular definitions of mediation and leverages machine learning, achieving $\sqrt{n}$-convergence and asymptotic normality. It evaluates the method in a simulation study and applies it to real data, estimating the extent to which pain management practices mediate the effect of a chronic pain disorder on opioid use disorder.
Causal mediation analyses investigate the mechanisms through which causes exert their effects, and are therefore central to scientific progress. The literature on the non-parametric definition and identification of mediational effects in rigorous causal models has grown significantly in recent years, and there has been important progress in addressing challenges in the interpretation and identification of such effects. Despite this progress on the causal inference front, statistical methodology for non-parametric estimation has lagged behind, with few or no methods available for non-parametric estimation in the presence of multiple, continuous, or high-dimensional mediators. In this paper we show that the identification formulas for six popular non-parametric approaches to mediation analysis proposed in recent years can be recovered from just two statistical estimands. We leverage this finding to propose an all-purpose one-step estimation algorithm that can be coupled with machine learning in any mediation study that uses any of these six definitions of mediation. The estimators have desirable properties, such as $\sqrt{n}$-convergence and asymptotic normality. Computing the first-order correction for the one-step estimator requires estimating complex density ratios on the potentially high-dimensional mediators, a challenge that we solve using recent advances in so-called Riesz learning. We examine the properties of our methods in a simulation study and illustrate their use on real data to estimate the extent to which pain management practices mediate the total effect of having a chronic pain disorder on opioid use disorder.
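To make the idea of a one-step estimator concrete, the following is a minimal sketch on a toy problem that is much simpler than the mediation estimands of the paper: estimating the mean outcome under treatment, $E[Y(1)]$, with a single binary confounder. It is not the paper's algorithm. The data-generating process, the function names `simulate` and `one_step_mean_treated`, and the constant `TRUE_MEAN` are all assumptions made for illustration. In this toy setting the Riesz representer is the inverse-probability weight $A / P(A = 1 \mid W)$ and is computed directly, whereas the paper learns the analogous weights with Riesz learning.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance

TRUE_MEAN = 3.75  # E[Y(1)] = 1 + 2 + 1.5 * P(W = 1) in the toy simulation below


def simulate(n, seed=0):
    """Draw rows (w, a, y): binary confounder W, binary treatment A, outcome Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < 0.3 + 0.4 * w else 0  # P(A=1 | W) is 0.3 or 0.7
        y = 1.0 + 2.0 * a + 1.5 * w + rng.gauss(0.0, 1.0)
        rows.append((w, a, y))
    return rows


def one_step_mean_treated(rows):
    """Return (plug-in, one-step estimate, standard error, 95% Wald interval)."""
    n = len(rows)
    # Deliberately crude outcome regression (assumption for illustration):
    # the mean of Y among the treated, ignoring the confounder W.
    m1_hat = fmean([y for _, a, y in rows if a == 1])
    # Propensity score: treated fraction within each level of W.
    g_hat = {lvl: fmean([a for w, a, _ in rows if w == lvl]) for lvl in (0, 1)}
    plug_in = m1_hat
    # First-order correction: Riesz representer (here A / g_hat(W)) times residual.
    corrections = [(a / g_hat[w]) * (y - m1_hat) for w, a, y in rows]
    estimate = plug_in + fmean(corrections)
    # Wald standard error from the empirical variance of the estimated
    # influence function; m1_hat is constant, so only the correction varies.
    se = math.sqrt(pvariance(corrections) / n)
    z = NormalDist().inv_cdf(0.975)
    return plug_in, estimate, se, (estimate - z * se, estimate + z * se)


rows = simulate(20000)
plug_in, estimate, se, ci = one_step_mean_treated(rows)
print(f"plug-in {plug_in:.3f}  one-step {estimate:.3f}  se {se:.3f}")
```

In this simulation the plug-in targets $E[Y \mid A = 1]$, which exceeds $E[Y(1)] = 3.75$ because treated units are more likely to have $W = 1$. The weighted residual term removes that bias. The same structure, an initial estimate plus the sample mean of an estimated correction term, is what allows flexible nuisance estimators to be combined with Wald-type inference.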