EM LG ST ME ML
Dec 28, 2025

Causal-Policy Forest for End-to-End Policy Learning

arXiv:2512.22846v1
h-index: 1

Originality: Incremental advance

AI Analysis

This work bridges policy learning and CATE estimation for researchers and practitioners in causal inference. It is an incremental advance because it builds on an existing method, the causal forest.

The study tackles policy learning in causal inference by proposing an end-to-end algorithm, the causal-policy forest, which modifies the causal forest to train policies that recommend optimal treatments. It rests on showing that, under $\{-1, 1\}$ restricted regression models, maximizing the policy value is equivalent to minimizing the mean squared error for the CATE.

This study proposes an end-to-end algorithm for policy learning in causal inference. We observe data consisting of covariates, treatment assignments, and outcomes, where only the outcome corresponding to the assigned treatment is observed. The goal of policy learning is to train a policy from the observed data that maximizes the policy value, where a policy is a function that recommends an optimal treatment for each individual. In this study, we first show that maximizing the policy value is equivalent to minimizing the mean squared error for the conditional average treatment effect (CATE) under $\{-1, 1\}$ restricted regression models. Based on this finding, we modify the causal forest, an end-to-end CATE estimation algorithm, for policy learning. We refer to our algorithm as the causal-policy forest. Our algorithm has three advantages. First, it is a simple modification of an existing, widely used CATE estimation method; therefore, it helps bridge the gap between policy learning and CATE estimation in practice. Second, while existing studies typically estimate nuisance parameters for policy learning as a separate task, our algorithm trains the policy in a more end-to-end manner. Third, as in standard decision trees and random forests, we train the models efficiently, avoiding computational intractability.
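
The equivalence stated in the abstract can be illustrated with a small numerical check. The sketch below is not the causal-policy forest and is not taken from the paper. It assumes a binary treatment, a simulated CATE $\tau(x)$ that is known exactly, and a hypothetical family of threshold rules $f_c(x) \in \{-1, 1\}$, with the policy defined as $\pi(x) = (1 + f(x))/2$. Under those assumptions, $(\tau - f)^2 = \tau^2 - 2\tau f + 1$ because $f^2 = 1$, so the policy value is an affine, decreasing function of the mean squared error and both criteria select the same rule. The paper's exact statement and notation may differ.

```python
import numpy as np

# Simulated oracle data: every name and value here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, size=n)   # a single covariate
tau = x - 0.2                        # hypothetical true CATE tau(x)
y0 = rng.normal(size=n)              # potential outcome without treatment
y1 = y0 + tau                        # potential outcome with treatment

# Hypothetical candidate rules f_c(x) in {-1, +1}; +1 means "treat".
thresholds = np.linspace(-1.0, 1.0, 21)


def sign_rule(x, c):
    """Return +1 where x > c and -1 elsewhere."""
    return np.where(x > c, 1.0, -1.0)


mse = np.empty(len(thresholds))
value = np.empty(len(thresholds))
for i, c in enumerate(thresholds):
    f = sign_rule(x, c)
    pi = (1.0 + f) / 2.0                            # policy in {0, 1}
    mse[i] = np.mean((tau - f) ** 2)                # MSE of f as a {-1, 1} model of tau
    value[i] = np.mean(pi * y1 + (1.0 - pi) * y0)   # oracle policy value

# Since f**2 == 1, value == const - mse / 4 holds exactly in-sample.
const = np.mean(y0) + np.mean(tau) / 2.0 + (np.mean(tau ** 2) + 1.0) / 4.0
best_by_mse = thresholds[np.argmin(mse)]
best_by_value = thresholds[np.argmax(value)]

print("threshold minimizing MSE:  ", best_by_mse)
print("threshold maximizing value:", best_by_value)
print("affine relation holds:     ", np.allclose(value, const - mse / 4.0))
```

The check uses both potential outcomes and the true $\tau$, which are never available in observed data. It only illustrates why a squared-error criterion over $\{-1, 1\}$-valued models can stand in for policy value; how to estimate that criterion from observed outcomes is what the paper's algorithm addresses.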
