ML LG Feb 6, 2023

Random Forests for time-fixed and time-dependent predictors: The DynForest R package

arXiv:2302.02670
Originality: Synthesis-oriented
AI Analysis

DynForest gives researchers in fields such as biostatistics and epidemiology a tool for applying random forests to complex longitudinal data. The contribution is incremental: it builds on existing random forest and mixed model methods.

The authors address the prediction of continuous, categorical, or time-to-event outcomes with random forests that take both time-fixed and time-dependent predictors. The time-dependent predictors may be endogenous, measured with error, or measured at irregular times. The DynForest R package handles them by internally summarizing each time-dependent predictor with a linear mixed model.
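
The summarization idea can be illustrated with a minimal Python sketch. It is not DynForest code and does not fit a mixed model: as a loudly simplified stand-in, it fits a separate ordinary least-squares line per subject and keeps the intercept and slope as that subject's features, whereas a linear mixed model would also pool information across subjects and account for measurement error. The function name `summarize_trajectories` and the toy records are hypothetical.

```python
from collections import defaultdict


def summarize_trajectories(records):
    """Reduce (subject_id, time, value) rows to one (intercept, slope) per subject.

    Assumption: a per-subject least-squares line stands in for the
    subject-specific random effects a linear mixed model would predict.
    """
    by_subject = defaultdict(list)
    for subject_id, time, value in records:
        by_subject[subject_id].append((time, value))

    features = {}
    for subject_id, points in by_subject.items():
        n = len(points)
        mean_t = sum(t for t, _ in points) / n
        mean_y = sum(y for _, y in points) / n
        sxx = sum((t - mean_t) ** 2 for t, _ in points)
        if sxx == 0:
            # A single visit (or identical times) cannot identify a slope;
            # fall back to a flat trajectory at the observed mean.
            slope = 0.0
        else:
            sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
            slope = sxy / sxx
        intercept = mean_y - slope * mean_t
        features[subject_id] = (intercept, slope)
    return features


# Hypothetical toy data: each subject is measured at its own, irregular times.
records = [
    ("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.5, 6.0),
    ("b", 0.5, 4.0), ("b", 3.0, 4.0),
    ("c", 1.0, 2.0),
]
features = summarize_trajectories(records)
```

Each subject ends up with a fixed-length feature vector regardless of how many visits it had or when they occurred, which is what lets a tree split on a time-dependent predictor in the same way as on a time-fixed one.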

The R package DynForest implements random forests for predicting a continuous, categorical, or (multiple-cause) time-to-event outcome from time-fixed and time-dependent predictors. The main originality of DynForest is that it handles time-dependent predictors that can be endogenous (i.e., impacted by the outcome process), measured with error, and measured at subject-specific times. At each recursive step of the tree-building process, the time-dependent predictors are internally summarized into individual features on which the split can be made. This is achieved using flexible linear mixed models (fitted with the R package lcmm) whose specification is defined in advance by the user. DynForest returns the mean for a continuous outcome, the majority-vote category for a categorical outcome, or the cumulative incidence function over time for a survival outcome. DynForest also computes variable importance and minimal depth to identify the most predictive variables or groups of variables. This paper guides the user through step-by-step examples of fitting random forests with DynForest.
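
The three kinds of forest-level output described above can be sketched in a few lines of Python. This is an illustrative sketch, not the package's implementation: the function names are hypothetical, and the pointwise averaging of per-tree cumulative incidence curves on a shared time grid is an assumption about how per-tree curves are combined.

```python
from collections import Counter
from statistics import mean


def aggregate_continuous(tree_predictions):
    """Continuous outcome: average the per-tree predicted values."""
    return mean(tree_predictions)


def aggregate_categorical(tree_predictions):
    """Categorical outcome: return the category predicted by most trees.

    Ties resolve to the category that appears first in the input.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


def aggregate_cif(tree_curves):
    """Survival outcome: average cumulative incidence curves pointwise.

    Assumption: every tree reports its curve on the same time grid,
    so position i in each list refers to the same time point.
    """
    return [mean(values) for values in zip(*tree_curves)]


# Hypothetical per-tree outputs for one subject from a three-tree forest.
continuous_pred = aggregate_continuous([1.0, 2.0, 3.0])
categorical_pred = aggregate_categorical(["x", "y", "x"])
cif_pred = aggregate_cif([[0.0, 0.25, 0.5], [0.0, 0.75, 1.0]])
```

Averaging curves pointwise keeps the result non-decreasing in time whenever each per-tree curve is non-decreasing, so the aggregate remains a valid cumulative incidence function.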
