ST · LG · Jan 8, 2018

Log-Scale Shrinkage Priors and Adaptive Bayesian Global-Local Shrinkage Estimation

arXiv:1801.02321v2 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses Bayesian estimation challenges in sparse models, offering an incremental improvement with adaptive procedures for statisticians and data scientists.

The paper tackles the problem of Bayesian shrinkage estimation by proposing log-scale distributions as priors for local shrinkage hyperparameters, introducing a new log-t prior that achieves super-efficiency and adapts well to varying sparsity and signal-to-noise ratios in simulations.

Global-local shrinkage hierarchies are an important innovation in Bayesian estimation. We propose the use of log-scale distributions as a novel basis for generating families of prior distributions for local shrinkage hyperparameters. By varying the scale parameter, one may vary the degree to which the prior distribution promotes sparsity in the coefficient estimates. By examining the class of distributions over the logarithm of the local shrinkage parameter that have log-linear or sub-log-linear tails, we show that many standard prior distributions for local shrinkage parameters can be unified in terms of the tail behaviour and concentration properties of their corresponding marginal distributions over the coefficients $β_j$. We derive upper bounds on the rate of concentration around $|β_j|=0$, and on the tail decay as $|β_j| \to \infty$, achievable by this wide class of prior distributions. We then propose a new type of ultra-heavy-tailed prior, called the log-$t$ prior, with the property that, irrespective of the choice of associated scale parameter, the marginal distribution always diverges at $β_j = 0$ and always possesses super-Cauchy tails. We develop results demonstrating when prior distributions with (sub)-log-linear tails attain Kullback--Leibler super-efficiency, and prove that the log-$t$ prior distribution is always super-efficient. We show that the log-$t$ prior is less sensitive to misspecification of the global shrinkage parameter than the horseshoe or lasso priors. By incorporating the scale parameter of the log-scale prior distributions into the Bayesian hierarchy, we derive novel adaptive shrinkage procedures. Simulations show that the adaptive log-$t$ procedure appears to always perform well, irrespective of the level of sparsity or signal-to-noise ratio of the underlying model.
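The idea of placing a prior on the logarithm of the local shrinkage parameter can be illustrated with a small Monte Carlo sketch. It is not the paper's code. It assumes the usual global-local form $β_j \mid λ_j, τ \sim N(0, τ^2 λ_j^2)$, takes the log-$t$ prior to mean $\log λ_j \sim$ scale $\times$ Student-$t$ with `df` degrees of freedom, and uses the standard half-Cauchy prior on $λ_j$ for the horseshoe. The function names, the default `df` and `scale` values, and the thresholds `eps` and `big` are hypothetical choices made for illustration. Everything is computed on the log scale, since exponentiating heavy-tailed draws of $\log λ_j$ would overflow.

```python
import numpy as np


def log_t_sampler(df=1.0, scale=1.0):
    """Assumed log-t prior: xi = log(lambda) ~ scale * Student-t(df)."""
    def draw(rng, n):
        return scale * rng.standard_t(df, size=n)
    return draw


def horseshoe_sampler():
    """Horseshoe local prior: lambda ~ half-Cauchy(0, 1), returned as log(lambda)."""
    def draw(rng, n):
        return np.log(np.abs(rng.standard_cauchy(size=n)))
    return draw


def sample_log_abs_beta(draw_log_lambda, n, tau=1.0, seed=0):
    """Draw log|beta_j| from the marginal prior, where beta_j = tau * lambda_j * z_j.

    Working with log|beta| avoids overflow in exp(xi) for heavy-tailed xi.
    """
    rng = np.random.default_rng(seed)
    xi = draw_log_lambda(rng, n)          # xi_j = log(lambda_j)
    z = rng.standard_normal(n)            # z_j ~ N(0, 1)
    return np.log(tau) + xi + np.log(np.abs(z))


def mass_summary(log_abs_beta, eps=1e-3, big=1e3):
    """Return (P(|beta| < eps), P(|beta| > big)) estimated from the draws."""
    near_zero = float(np.mean(log_abs_beta < np.log(eps)))
    far_tail = float(np.mean(log_abs_beta > np.log(big)))
    return near_zero, far_tail


if __name__ == "__main__":
    n = 200_000
    for name, sampler in [("log-t", log_t_sampler()), ("horseshoe", horseshoe_sampler())]:
        near_zero, far_tail = mass_summary(sample_log_abs_beta(sampler, n))
        print(f"{name:10s} P(|beta|<1e-3)={near_zero:.4f}  P(|beta|>1e3)={far_tail:.4f}")
```

Under these assumptions the log-$t$ draws should place more prior mass both very close to zero and far out in the tails than the horseshoe draws do, which is the qualitative behaviour the abstract describes. The sketch does not reproduce the paper's adaptive procedure, in which the scale parameter is itself given a prior and estimated.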
