Long-Term Effect Estimation with Surrogate Representation
This work addresses the challenge of accurately predicting long-term outcomes, such as ad revenue, for researchers and practitioners in causal inference, though its contribution is incremental: it integrates sequential models into long-term causal inference.
The paper tackles the problem of estimating long-term causal effects in observational studies, where confounding bias and impractical surrogacy assumptions lead to large errors. It proposes a framework that learns surrogate representations to account for temporal unconfoundedness, and it reports outperforming state-of-the-art methods in experiments.
There are many scenarios where the short- and long-term causal effects of an intervention differ. For example, low-quality ads may increase short-term ad clicks but decrease long-term revenue through reduced clicks. This work therefore studies the problem of long-term effect estimation, where the outcome of primary interest, or primary outcome, takes months or even years to accumulate. The observational study of long-term effects presents unique challenges. First, confounding bias causes large estimation error and variance, which can accumulate further in the prediction of primary outcomes. Second, short-term outcomes are often used directly as a proxy for the primary outcome, i.e., as the surrogate. This approach, however, entails the strong surrogacy assumption, which is often impractical. To tackle these challenges, we propose to build connections between long-term causal inference and sequential models in machine learning. This enables us to learn surrogate representations that account for temporal unconfoundedness and circumvent the stringent surrogacy assumption by conditioning on the inferred time-varying confounders. Experimental results show that the proposed framework outperforms the state of the art.
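
To make the ad example and the surrogacy concern concrete, the following is a minimal toy simulation, not the paper's framework. It assumes a randomized treatment (showing a low-quality ad), a short-term surrogate (clicks), and a primary outcome (long-term revenue) that the treatment also affects directly, so the surrogacy assumption (primary outcome independent of treatment given the surrogate) is violated by construction. All names, coefficients, and the linear data-generating process are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=100_000, seed=0):
    """Toy data: treatment raises short-term clicks but lowers long-term revenue.

    Hypothetical linear model chosen for illustration only:
      s = 1 + 0.5 * t + noise          (short-term effect is +0.5)
      y = 2 * s - 2 * t + noise        (total long-term effect is 2*0.5 - 2 = -1)
    The direct -2 * t term is what violates surrogacy.
    """
    rng = np.random.default_rng(seed)
    t = rng.integers(0, 2, size=n)             # randomized binary treatment
    s = 1.0 + 0.5 * t + rng.normal(size=n)     # surrogate: short-term clicks
    y = 2.0 * s - 2.0 * t + rng.normal(size=n)  # primary: long-term revenue
    return t, s, y


def diff_in_means(x, t):
    """Difference between treated and control group means."""
    return x[t == 1].mean() - x[t == 0].mean()


def surrogate_only_estimate(t, s, y):
    """Predict y from s alone, then compare predictions across groups.

    This treats the surrogate as a complete proxy for the primary outcome,
    which is only valid when surrogacy holds.
    """
    slope, intercept = np.polyfit(s, y, deg=1)
    y_hat = slope * s + intercept
    return diff_in_means(y_hat, t)


t, s, y = simulate()
short_term_effect = diff_in_means(s, t)
long_term_effect = diff_in_means(y, t)
surrogate_effect = surrogate_only_estimate(t, s, y)

print(f"short-term effect on clicks:      {short_term_effect:+.3f}")
print(f"long-term effect on revenue:      {long_term_effect:+.3f}")
print(f"surrogate-only long-term estimate: {surrogate_effect:+.3f}")
```

In this setup the surrogate-only estimate inherits the positive sign of the short-term effect even though the long-term effect is negative by design, which is the kind of failure that motivates conditioning on more than the raw short-term outcome.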