ML LG · Sep 3, 2015

Bayesian Masking: Sparse Bayesian Estimation with Weaker Shrinkage Bias

Yohei Kondo, Kohei Hayashi, Shin-ichi Maeda

arXiv:1509.01004v2 · 2.81 citations

Originality: Incremental advance

AI Analysis

The paper addresses incorrect feature selection caused by shrinkage bias in sparse estimation. The advance is incremental because it builds on existing methods such as Lasso and ARD.

The paper tackles shrinkage bias in sparse linear regression by proposing Bayesian Masking (BM), a method that introduces binary latent variables to mask features without regularizing the weights, which yields a better sparsity-shrinkage trade-off than Lasso and ARD.
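To make the shrinkage bias concrete, here is a minimal sketch in the simplest textbook setting, an orthonormal design (X^T X = I), where the Lasso solution is the soft-thresholded least-squares estimate. The weights `w_ols` and the penalty `lam` are illustrative values, not numbers from the paper.

```python
import numpy as np

def soft_threshold(w_ols, lam):
    """Lasso solution for an orthonormal design (X^T X = I).

    Minimizes 0.5 * ||y - X w||^2 + lam * ||w||_1, given w_ols = X^T y.
    """
    return np.sign(w_ols) * np.maximum(np.abs(w_ols) - lam, 0.0)

# Illustrative least-squares estimates: two strong features,
# one weak feature, one irrelevant feature.
w_ols = np.array([3.0, -2.0, 0.4, 0.0])
lam = 0.5  # illustrative penalty strength

w_lasso = soft_threshold(w_ols, lam)

# Every surviving weight is pulled toward zero by exactly lam.
shrinkage = np.abs(w_ols) - np.abs(w_lasso)
print(w_lasso)
print(shrinkage)
```

The penalty that zeroes the weak third feature also moves each strong weight toward zero by `lam`, which is the bias the summary refers to.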

A common strategy for sparse linear regression is to introduce regularization, which eliminates irrelevant features by letting the corresponding weights be zeros. However, regularization often shrinks the estimator for relevant features, which leads to incorrect feature selection. Motivated by the above-mentioned issue, we propose Bayesian masking (BM), a sparse estimation method which imposes no regularization on the weights. The key concept of BM is to introduce binary latent variables that randomly mask features. Estimating the masking rates determines the relevance of the features automatically. We derive a variational Bayesian inference algorithm that maximizes the lower bound of the factorized information criterion (FIC), which is a recently developed asymptotic criterion for evaluating the marginal log-likelihood. In addition, we propose reparametrization to accelerate the convergence of the derived algorithm. Finally, we show that BM outperforms Lasso and automatic relevance determination (ARD) in terms of the sparsity-shrinkage trade-off.
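The masking idea can be illustrated with a small simulation. This is only a sketch of one possible reading of the abstract, not the authors' algorithm: it assumes an independent Bernoulli mask for each sample and feature, and it fixes the rates by hand rather than estimating them with the FIC-based variational inference. All names (`keep_rate`, `Z`, `w_true`) are hypothetical, and `keep_rate` is one minus the masking rate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 4

X = rng.normal(size=(n, d))
w_true = np.array([2.0, -1.0, 0.5, 3.0])  # hypothetical weights

# Assumed limiting case: features 0 and 1 are never masked,
# features 2 and 3 are always masked.
keep_rate = np.array([1.0, 1.0, 0.0, 0.0])

# Binary latent masks (assumption: one Bernoulli draw per sample and feature).
Z = rng.random((n, d)) < keep_rate

# Masked linear model: a masked feature contributes nothing to y,
# whatever its weight is.
y = (Z * X) @ w_true + 0.1 * rng.normal(size=n)

# With the mask known, the kept weights can be fit by plain least squares,
# with no penalty term and therefore no shrinkage toward zero.
relevant = keep_rate > 0.5
w_hat, *_ = np.linalg.lstsq(X[:, relevant], y, rcond=None)
print(w_hat)
```

In this sketch sparsity comes from the mask rather than from a penalty on the weights, so the estimates for the kept features are not shrunk.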
