ME, EM, ML | Oct 5, 2016

Generalized Random Forests

arXiv:1610.01271v4 | 1696 citations
AI Analysis

Generalized random forests give statisticians and data scientists a flexible tool for complex estimation tasks such as quantile regression and treatment effect estimation, though the method builds incrementally on existing random forest techniques.

The authors address non-parametric statistical estimation by proposing generalized random forests, a method that adapts random forests to fit any quantity of interest identified by a set of local moment equations. The resulting estimates are consistent and asymptotically Gaussian, and they come with valid confidence intervals.

We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method considers a weighted set of nearby training examples; however, instead of using classical kernel weighting functions that are prone to a strong curse of dimensionality, we use an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest. We propose a flexible, computationally efficient algorithm for growing generalized random forests, develop a large sample theory for our method showing that our estimates are consistent and asymptotically Gaussian, and provide an estimator for their asymptotic variance that enables valid confidence intervals. We use our approach to develop new methods for three statistical tasks: non-parametric quantile regression, conditional average partial effect estimation, and heterogeneous treatment effect estimation via instrumental variables. A software implementation, grf for R and C++, is available from CRAN.
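The core idea in the abstract, using a forest as an adaptive weighting function and then solving a weighted local moment equation, can be illustrated with a minimal Python sketch. This is not the grf implementation. The "trees" here are toy random partitions of a one-dimensional feature (an assumption made to keep the example self-contained, not the paper's splitting rule), and all function names are hypothetical. The weight of training point i at a query point x0 is the average, over trees, of the indicator that i shares x0's leaf divided by the leaf size. The target is then the solution of the weighted moment equation: a weighted mean for the moment function y - theta, or a weighted quantile for q - 1{y <= theta}.

```python
import bisect
import random


def grow_random_partition(xs, n_cuts, rng):
    """Toy stand-in for a tree (assumption): random cut points on a 1-D feature."""
    lo, hi = min(xs), max(xs)
    return sorted(rng.uniform(lo, hi) for _ in range(n_cuts))


def leaf_index(cuts, x):
    """Index of the interval (leaf) of the partition that contains x."""
    return bisect.bisect_right(cuts, x)


def forest_weights(forest, xs, x0):
    """Weights alpha_i(x0): per-tree 1{x_i in leaf of x0} / leaf size, averaged over trees."""
    weights = [0.0] * len(xs)
    used = 0
    for cuts in forest:
        target = leaf_index(cuts, x0)
        members = [i for i, x in enumerate(xs) if leaf_index(cuts, x) == target]
        if not members:
            continue  # this tree has no training points in x0's leaf; skip it
        used += 1
        for i in members:
            weights[i] += 1.0 / len(members)
    if used == 0:
        raise ValueError("no tree places any training point near x0")
    return [w / used for w in weights]


def local_mean(weights, ys):
    """Solve sum_i alpha_i * (y_i - theta) = 0 for theta (weights sum to one)."""
    return sum(w * y for w, y in zip(weights, ys))


def local_quantile(weights, ys, q):
    """Approximately solve sum_i alpha_i * (q - 1{y_i <= theta}) = 0:
    the smallest observed y whose cumulative weight reaches q."""
    pairs = sorted(zip(ys, weights))
    cumulative = 0.0
    for y, w in pairs:
        cumulative += w
        if cumulative >= q - 1e-12:
            return y
    return pairs[-1][0]


# Usage on synthetic data (hypothetical example): y = 2x + noise.
rng = random.Random(0)
xs = [i / 199 for i in range(200)]
ys = [2 * x + rng.gauss(0, 0.1) for x in xs]
forest = [grow_random_partition(xs, 8, rng) for _ in range(50)]

alpha = forest_weights(forest, xs, 0.5)
theta_mean = local_mean(alpha, ys)
theta_q10 = local_quantile(alpha, ys, 0.1)
theta_q90 = local_quantile(alpha, ys, 0.9)
```

Separating the weights from the moment equation is what lets the same forest-style neighbourhood serve different targets: swapping `local_mean` for `local_quantile` changes the estimand without changing how the weights are computed. In the method described above, the forest itself is additionally grown to express heterogeneity in the chosen quantity, which this sketch does not attempt.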

Code Implementations (5 repos)