Variable selection for Gaussian processes via sensitivity analysis of the posterior predictive distribution
This work addresses variable selection for Gaussian process models, offering incremental improvements over existing methods such as automatic relevance determination.
The authors propose two new variable selection methods for Gaussian process models that rank variables by predictive relevance, using the predictions of a full model near the training points. Compared with automatic relevance determination, the methods improve variable selection in terms of variability and predictive performance.
Variable selection for Gaussian process models is often done using automatic relevance determination, which uses the inverse length-scale parameter of each input variable as a proxy for variable relevance. This implicitly determined relevance has several drawbacks that prevent the selection of optimal input variables in terms of predictive performance. To improve on this, we propose two novel variable selection methods for Gaussian process models that utilize the predictions of a full model in the vicinity of the training points and thereby rank the variables based on their predictive relevance. Our empirical results on synthetic and real-world data sets demonstrate improved variable selection compared to automatic relevance determination in terms of variability and predictive performance.
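
To make the contrast concrete, the following is a minimal sketch of the two kinds of ranking described above. It is not the authors' method: the relevance measure used here (average finite-difference change in the posterior predictive mean when one input is shifted slightly at each training point) is a simplified stand-in chosen for illustration. The toy data, the fixed length-scales, the noise variance, and all function names (`se_ard_kernel`, `gp_posterior_mean`, `ard_relevance`, `predictive_sensitivity`) are assumptions. In practice the length-scales would be estimated from data rather than fixed by hand.

```python
import numpy as np


def se_ard_kernel(A, B, lengthscales, signal_var=1.0):
    """Squared-exponential kernel with one length-scale per input variable."""
    A_s = A / lengthscales
    B_s = B / lengthscales
    sq_dist = (
        np.sum(A_s**2, axis=1)[:, None]
        + np.sum(B_s**2, axis=1)[None, :]
        - 2.0 * A_s @ B_s.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0))


def gp_posterior_mean(X, y, X_new, lengthscales, noise_var=0.01):
    """Posterior predictive mean of a zero-mean GP at the points X_new."""
    K = se_ard_kernel(X, X, lengthscales) + noise_var * np.eye(len(X))
    alpha = np.linalg.solve(K, y)
    return se_ard_kernel(X_new, X, lengthscales) @ alpha


def ard_relevance(lengthscales):
    """ARD proxy: relevance of each variable is its inverse length-scale."""
    return 1.0 / np.asarray(lengthscales, dtype=float)


def predictive_sensitivity(X, y, lengthscales, delta=1e-3):
    """Illustrative relevance: mean absolute change in the predictive mean
    per unit shift of one input, evaluated at the training points."""
    base = gp_posterior_mean(X, y, X, lengthscales)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_shift = X.copy()
        X_shift[:, j] += delta  # perturb only variable j
        shifted = gp_posterior_mean(X, y, X_shift, lengthscales)
        scores[j] = np.mean(np.abs(shifted - base)) / delta
    return scores


# Toy data (assumed): variable 0 is nonlinear, 1 is linear, 2 is irrelevant.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(100, 3))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(100)

# Hypothetical length-scales, fixed by hand instead of being optimized.
lengthscales = np.array([1.0, 3.0, 10.0])

ard_scores = ard_relevance(lengthscales)
sens_scores = predictive_sensitivity(X, y, lengthscales)

# Rank variables from most to least relevant under each measure.
ard_ranking = np.argsort(-ard_scores)
sens_ranking = np.argsort(-sens_scores)

print("ARD ranking:        ", ard_ranking)
print("Sensitivity ranking:", sens_ranking)
```

The ARD score depends only on the kernel hyperparameters, whereas the sensitivity score is computed from the model's predictions around the training inputs, which is the general idea the abstract describes. A fuller treatment would also account for predictive uncertainty, which this sketch ignores.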