ML | LG | Jan 11, 2024

A tree-based varying coefficient model

arXiv:2401.05982v3 | 1 citation | h-index: 2 | Comput Stat
Originality: Synthesis-oriented
AI Analysis

This is an incremental improvement for statistical modeling and machine learning practitioners, offering interpretability and feature selection in varying coefficient models.

The paper tackles the problem of modeling varying coefficients by introducing a tree-based varying coefficient model using cyclic gradient boosting, achieving out-of-sample loss results comparable to a neural network-based VCM on simulated and real data.

The paper introduces a tree-based varying coefficient model (VCM) in which the varying coefficients are modelled using the cyclic gradient boosting machine (CGBM) of Delong et al. (2023). Modelling the coefficient functions with a CGBM allows for dimension-wise early stopping and feature importance scores. Dimension-wise early stopping not only reduces the risk of dimension-specific overfitting but also reveals differences in model complexity across dimensions. The feature importance scores allow for simple feature selection and easy model interpretation. The model is evaluated on the same simulated and real data examples as those used in Richman and Wüthrich (2023), and its out-of-sample loss is comparable to that of their neural network-based VCM, LocalGLMnet.
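To make the idea of cyclic boosting with dimension-wise stopping concrete, here is a minimal sketch, not the authors' implementation. It assumes squared error loss with an identity link, a single scalar effect modifier `z`, and regression stumps as base learners. The names `fit_stump`, `stump_predict`, `fit_cyclic_vcm`, and `predict`, as well as the `rounds` argument, are hypothetical and chosen only for illustration. The mean is modelled as beta_0(z) + sum_j beta_j(z) x_j, and each coefficient function is updated in turn by a stump fitted to the negative gradient of the loss with respect to that coefficient.

```python
def fit_stump(z, g):
    """Fit a least-squares regression stump to targets g on a scalar feature z."""
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    total = sum(g)
    # Maximising sum(leaf_sum^2 / leaf_size) is equivalent to minimising the SSE.
    best_score, split = total * total / n, None
    left_val = right_val = total / n
    left_sum = 0.0
    for k in range(1, n):
        left_sum += g[order[k - 1]]
        lo, hi = z[order[k - 1]], z[order[k]]
        if lo == hi:
            continue  # cannot split between identical feature values
        right_sum = total - left_sum
        score = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if score > best_score:
            best_score, split = score, (lo + hi) / 2
            left_val, right_val = left_sum / k, right_sum / (n - k)
    return split, left_val, right_val


def stump_predict(stump, zi):
    """Evaluate a stump; a stump without a split is a constant."""
    split, left_val, right_val = stump
    return left_val if split is None or zi <= split else right_val


def fit_cyclic_vcm(x, z, y, rounds, lr=0.1):
    """Cyclically boost the coefficient functions of a VCM (illustrative only).

    x: list of covariate rows, z: scalar effect modifiers, y: responses.
    rounds[j]: number of boosting steps for coefficient j (j = 0 is the
    intercept); an assumed stand-in for dimension-wise early stopping.
    """
    n, p = len(y), len(x[0])
    design = [[1.0] + list(row) for row in x]    # prepend intercept column
    beta = [[0.0] * (p + 1) for _ in range(n)]   # beta[i][j] = beta_j(z_i)
    ensembles = [[] for _ in range(p + 1)]       # one stump list per dimension
    for step in range(max(rounds)):
        for j in range(p + 1):
            if step >= rounds[j]:
                continue  # dimension j has already been stopped
            mu = [sum(b * d for b, d in zip(beta[i], design[i])) for i in range(n)]
            # Negative gradient of 0.5 * (y - mu)^2 with respect to beta_j.
            neg_grad = [(y[i] - mu[i]) * design[i][j] for i in range(n)]
            split, lv, rv = fit_stump(z, neg_grad)
            stump = (split, lr * lv, lr * rv)    # store the shrunken leaf values
            ensembles[j].append(stump)
            for i in range(n):
                beta[i][j] += stump_predict(stump, z[i])
    return ensembles


def predict(ensembles, row, zi):
    """Predict the mean for one observation from the fitted stump ensembles."""
    design = [1.0] + list(row)
    return sum(d * sum(stump_predict(s, zi) for s in trees)
               for d, trees in zip(design, ensembles))
```

Calling `fit_cyclic_vcm(x, z, y, rounds=[30, 100])` would stop the intercept after 30 steps while the slope keeps boosting for 100, so the length of each ensemble gives a rough indication of how complex each coefficient function is. In practice the stopping points would be chosen from validation loss rather than fixed in advance.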

Code Implementations: 1 repo
