Fast Bayesian Variable Selection in Binomial and Negative Binomial Regression
The paper provides a faster method for Bayesian variable selection in count data models, which are widely used in fields such as biology, ecology, and economics. The contribution appears incremental, however, because it builds on an existing sampling technique, Tempered Gibbs Sampling.
The paper tackles the computational challenges of Bayesian variable selection in binomial and negative binomial regression by introducing an efficient MCMC scheme based on Tempered Gibbs Sampling, demonstrating effectiveness on cancer data with 17,000 covariates.
Bayesian variable selection is a powerful tool for data analysis, as it offers a principled method for variable selection that accounts for prior information and uncertainty. However, wider adoption of Bayesian variable selection has been hampered by computational challenges, especially in difficult regimes with a large number of covariates or non-conjugate likelihoods. Generalized linear models for count data, which are prevalent in biology, ecology, economics, and beyond, represent an important special case. Here we introduce an efficient MCMC scheme for variable selection in binomial and negative binomial regression that exploits Tempered Gibbs Sampling (Zanella and Roberts, 2019) and that includes logistic regression as a special case. In experiments we demonstrate the effectiveness of our approach, including on cancer data with seventeen thousand covariates.
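The claim that logistic regression is a special case can be checked numerically. The sketch below is not the paper's sampler. It assumes the standard logit link and writes the linear predictor with binary inclusion indicators, as is common in Bayesian variable selection. All function names and toy values are hypothetical. With a total count of one, the binomial log-likelihood coincides with the Bernoulli (logistic regression) log-likelihood.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear_predictor(x, beta, gamma):
    """psi = sum_j gamma_j * beta_j * x_j, where gamma_j in {0, 1} marks inclusion.

    The indicator parameterization is an assumption made for illustration.
    """
    return sum(g * b * xj for xj, b, g in zip(x, beta, gamma))


def binomial_logit_loglik(y, n, psi):
    """Log-likelihood of y successes in n trials with success probability sigmoid(psi)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    # y*log(p) + (n - y)*log(1 - p) simplifies to y*psi - n*log(1 + exp(psi)).
    return log_binom + y * psi - n * softplus(psi)


def bernoulli_logit_loglik(y, psi):
    """Logistic regression log-likelihood for a single binary outcome y."""
    return y * psi - softplus(psi)


# Hypothetical toy values: three covariates, the second one excluded.
x = [1.0, -0.5, 2.0]
beta = [0.8, 1.5, -0.3]
gamma = [1, 0, 1]
psi = linear_predictor(x, beta, gamma)

# With n = 1 the binomial and Bernoulli log-likelihoods agree for both outcomes.
for y in (0, 1):
    assert math.isclose(binomial_logit_loglik(y, 1, psi), bernoulli_logit_loglik(y, psi))
```

Setting an entry of `gamma` to zero removes that covariate from the predictor. A variable selection sampler explores exactly this space of inclusion patterns, and that space grows exponentially with the number of covariates.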