STITMLSep 8, 2021

Sharp regret bounds for empirical Bayes and compound decision problems

arXiv:2109.03943v229 citations
Originality Highly original
AI Analysis

This provides foundational theoretical insights into the limits of data-driven estimation in empirical Bayes, impacting statisticians and machine learning researchers by clarifying asymptotic regret scaling.

The paper establishes sharp regret bounds for empirical Bayes and compound decision problems in normal and Poisson models, showing optimal regret scales as Θ((log n / log log n)^2) and Θ(log^3 n) for Poisson with compactly supported and subexponential priors, and at least Ω((log n / log log n)^2) and Ω(log^2 n) for normal with compactly supported and subgaussian priors, resolving a long-standing conjecture.

We consider the classical problems of estimating the mean of an $n$-dimensional normally (with identity covariance matrix) or Poisson distributed vector under the squared loss. In a Bayesian setting the optimal estimator is given by the prior-dependent conditional mean. In a frequentist setting various shrinkage methods were developed over the last century. The framework of empirical Bayes, put forth by Robbins (1956), combines Bayesian and frequentist mindsets by postulating that the parameters are independent but with an unknown prior and aims to use a fully data-driven estimator to compete with the Bayesian oracle that knows the true prior. The central figure of merit is the regret, namely, the total excess risk over the Bayes risk in the worst case (over the priors). Although this paradigm was introduced more than 60 years ago, little is known about the asymptotic scaling of the optimal regret in the nonparametric setting. We show that for the Poisson model with compactly supported and subexponential priors, the optimal regret scales as $Θ((\frac{\log n}{\log\log n})^2)$ and $Θ(\log^3 n)$, respectively, both attained by the original estimator of Robbins. For the normal mean model, the regret is shown to be at least $Ω((\frac{\log n}{\log\log n})^2)$ and $Ω(\log^2 n)$ for compactly supported and subgaussian priors, respectively, the former of which resolves the conjecture of Singh (1979) on the impossibility of achieving bounded regret; before this work, the best regret lower bound was $Ω(1)$. In addition to the empirical Bayes setting, these results are shown to hold in the compound setting where the parameters are deterministic. As a side application, the construction in this paper also leads to improved or new lower bounds for density estimation of Gaussian and Poisson mixtures.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes