ML · LG · Jul 29, 2025

MIBoost: A Gradient Boosting Algorithm for Variable Selection After Multiple Imputation

arXiv:2507.21807v4 · h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses a practical issue for statisticians and data analysts who handle missing data in predictive modeling, though the advance is incremental because it extends an existing principle to gradient boosting.

The paper tackles variable selection after multiple imputation of missing data. It proposes MIBoost, a gradient boosting algorithm that unifies variable selection across imputed datasets. Simulation studies suggest prediction performance comparable to that of recent methods.

Statistical learning methods for automated variable selection, such as LASSO, elastic nets, or gradient boosting, have become increasingly popular tools for building powerful prediction models. Yet, in practice, analyses are often complicated by missing data. The most widely used approach to address missingness is multiple imputation, which involves creating several completed datasets. However, there is an ongoing debate on how to perform model selection in the presence of multiple imputed datasets. Simple strategies, such as pooling models across datasets, have been shown to have suboptimal properties. Although more sophisticated methods exist, they are often difficult to implement and therefore not widely applied. In contrast, two recent approaches modify the regularization methods LASSO and elastic nets by defining a single loss function, resulting in a unified set of coefficients across imputations. Our key contribution is to extend this principle to the framework of component-wise gradient boosting by proposing MIBoost, a novel algorithm that employs a uniform variable-selection mechanism across imputed datasets. Simulation studies suggest that our approach yields prediction performance comparable to that of these recently proposed methods.
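To make the idea of a uniform variable-selection mechanism concrete, here is a minimal Python sketch of component-wise L2 boosting run jointly over several imputed datasets. It is one plausible reading of the abstract, not the authors' MIBoost implementation: the function name `boost_across_imputations`, the squared-error loss, the step size `nu`, and the final averaging of coefficients across imputations are all assumptions made for illustration. At each iteration, every candidate variable is scored by the loss reduction summed over all imputed datasets, so the same variable is updated in every dataset.

```python
import numpy as np


def boost_across_imputations(X_list, y_list, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with one shared selection step per iteration.

    X_list: list of M arrays of shape (n, p), the imputed design matrices
            (assumed centered, with no constant columns).
    y_list: list of M arrays of shape (n,), the (centered) responses.
    Returns the coefficients averaged over imputations and the indices of
    the selected variables. Hypothetical sketch, not the published algorithm.
    """
    M = len(X_list)
    p = X_list[0].shape[1]
    beta = np.zeros((M, p))  # one coefficient vector per imputed dataset
    resid = [np.asarray(y, dtype=float).copy() for y in y_list]
    # Sum of squares of each column, per imputation: shape (M, p)
    col_ss = np.array([(X ** 2).sum(axis=0) for X in X_list])

    for _ in range(n_iter):
        # Least-squares slope of the current residual on each single column
        slopes = np.array([X.T @ r for X, r in zip(X_list, resid)]) / col_ss
        # Drop in residual sum of squares per column, summed over imputations
        gain = (slopes ** 2 * col_ss).sum(axis=0)
        j = int(np.argmax(gain))  # one variable chosen for all datasets
        for m in range(M):
            step = nu * slopes[m, j]
            beta[m, j] += step
            resid[m] -= step * X_list[m][:, j]

    selected = np.flatnonzero(np.any(beta != 0, axis=0))
    return beta.mean(axis=0), selected
```

Because the selection step aggregates over imputations, a variable enters either all of the per-imputation models or none of them, which is what allows a single final set of selected variables. Stopping early (a small `n_iter`) is what keeps the model sparse in this sketch.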

