Weijie Ji

ME
h-index3
3papers
10citations
Novelty75%
AI Score31

3 Papers

MLFeb 17, 2024
Adaptive Split Balancing for Optimal Random Forest

Yuqian Zhang, Weijie Ji, Jelena Bradic

In this paper, we propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method. Rather than relying on the widely-used random feature selection, we propose a permutation-based balanced splitting criterion. The adaptive split balancing forest (ASBF), achieves minimax optimality under the Lipschitz class. Its localized version, which fits local regressions at the leaf level, attains the minimax rate under the broad Hölder class $\mathcal{H}^{q,β}$ of problems for any $q\in\mathbb{N}$ and $β\in(0,1]$. We identify that over-reliance on auxiliary randomness in tree construction may compromise the approximation power of trees, leading to suboptimal results. Conversely, the proposed less random, permutation-based approach demonstrates optimality over a wide range of models. Although random forests are known to perform well empirically, their theoretical convergence rates are slow. Simplified versions that construct trees without data dependence offer faster rates but lack adaptability during tree growth. Our proposed method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data. Additionally, we establish uniform upper bounds and demonstrate that ASBF improves dimensionality dependence in average treatment effect estimation problems. Simulation studies and real-world applications demonstrate our methods' superior performance over existing random forests.

MENov 12, 2021
Dynamic treatment effects: high-dimensional inference under model misspecification

Yuqian Zhang, Weijie Ji, Jelena Bradic

Estimating dynamic treatment effects is crucial across various disciplines, providing insights into the time-dependent causal impact of interventions. However, this estimation poses challenges due to time-varying confounding, leading to potentially biased estimates. Furthermore, accurately specifying the growing number of treatment assignments and outcome models with multiple exposures appears increasingly challenging to accomplish. Double robustness, which permits model misspecification, holds great value in addressing these challenges. This paper introduces a novel "sequential model doubly robust" estimator. We develop novel moment-targeting estimates to account for confounding effects and establish that root-$N$ inference can be achieved as long as at least one nuisance model is correctly specified at each exposure time, despite the presence of high-dimensional covariates. Although the nuisance estimates themselves do not achieve root-$N$ rates, the carefully designed loss functions in our framework ensure final root-$N$ inference for the causal parameter of interest. Unlike off-the-shelf high-dimensional methods, which fail to deliver robust inference under model misspecification even within the doubly robust framework, our newly developed loss functions address this limitation effectively.

MEOct 10, 2021
High-dimensional Inference for Dynamic Treatment Effects

Jelena Bradic, Weijie Ji, Yuqian Zhang

Estimating dynamic treatment effects is a crucial endeavor in causal inference, particularly when confronted with high-dimensional confounders. Doubly robust (DR) approaches have emerged as promising tools for estimating treatment effects due to their flexibility. However, we showcase that the traditional DR approaches that only focus on the DR representation of the expected outcomes may fall short of delivering optimal results. In this paper, we propose a novel DR representation for intermediate conditional outcome models that leads to superior robustness guarantees. The proposed method achieves consistency even with high-dimensional confounders, as long as at least one nuisance function is appropriately parametrized for each exposure time and treatment path. Our results represent a significant step forward as they provide new robustness guarantees. The key to achieving these results is our new DR representation, which offers superior inferential performance while requiring weaker assumptions. Lastly, we confirm our findings in practice through simulations and a real data application.