ME CO ML

Jul 18, 2014

Extensions of stability selection using subsamples of observations and covariates

Andre Beinrucker, Ürün Dogan, Gilles Blanchard

arXiv:1407.4916v3

21 citations

Originality: Incremental advance

AI Analysis

This work provides incremental improvements to variable selection methods for statisticians and data scientists, enhancing stability and performance in high-dimensional data analysis.

The paper tackles the problem of stabilizing variable selection methods by extending stability selection to use random subsamples of observations and covariates, generalizing theoretical results to arbitrary subsample sizes and validating improvements through numerical experiments on synthetic and real datasets.

We introduce extensions of stability selection, a method to stabilise variable selection methods introduced by Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010). We propose to apply a base selection method repeatedly to random observation subsamples and covariate subsets under scrutiny, and to select covariates based on their selection frequency. We analyse the effects and benefits of these extensions. Our analysis generalises the theoretical results of Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010) from the case of half-samples to subsamples of arbitrary size. We study theoretically the effect of taking random covariate subsets using a simplified score model. Finally, we validate these extensions in numerical experiments on both synthetic and real datasets, and compare the results in detail to those of the original stability selection method.
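The procedure described in the abstract (repeatedly applying a base selector to random observation subsamples and covariate subsets, then keeping covariates by selection frequency) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The base selector `top_k_correlation`, the function names, the default fractions, and the threshold of 0.6 are all hypothetical choices made here. Dividing each covariate's selection count by the number of rounds in which it was drawn is likewise an assumption about how frequencies might be normalised.

```python
import numpy as np


def top_k_correlation(X, y, k):
    """Hypothetical base selector: the k columns with largest |correlation| to y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Guard against constant columns or a constant response.
    scores = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, 1.0)
    return np.argsort(scores)[::-1][:k]


def selection_frequencies(X, y, n_rounds=200, obs_frac=0.5, cov_frac=0.5, k=3, seed=0):
    """Estimate how often each covariate is selected, given that it was drawn.

    obs_frac and cov_frac are illustrative parameters: the fraction of
    observations and of covariates sampled (without replacement) per round.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_sub = max(2, int(round(obs_frac * n)))
    p_sub = max(1, int(round(cov_frac * p)))
    selected = np.zeros(p)  # rounds in which the covariate was selected
    drawn = np.zeros(p)     # rounds in which the covariate was available
    for _ in range(n_rounds):
        rows = rng.choice(n, size=n_sub, replace=False)
        cols = rng.choice(p, size=p_sub, replace=False)
        picked = top_k_correlation(X[np.ix_(rows, cols)], y[rows], min(k, p_sub))
        drawn[cols] += 1
        selected[cols[picked]] += 1  # map local indices back to original columns
    # Assumption: normalise by availability, not by the total number of rounds.
    return np.divide(selected, drawn, out=np.zeros(p), where=drawn > 0)


def stable_set(freq, threshold=0.6):
    """Covariates whose selection frequency reaches the (illustrative) threshold."""
    return np.flatnonzero(freq >= threshold)
```

Setting `cov_frac=1.0` and `obs_frac=0.5` corresponds to the half-sample setting the abstract attributes to the original method, while other values of `obs_frac` correspond to the arbitrary subsample sizes the paper analyses. Any selector that returns column indices, for example one based on a penalised regression, could be substituted for `top_k_correlation`.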
