ST | ME | ML | Aug 14, 2013

Confidence Sets Based on Thresholding Estimators in High-Dimensional Gaussian Regression Models

arXiv:1308.3201v2 | 10 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of reliable inference in high-dimensional statistics, which matters to both researchers and practitioners, but the advance is incremental because it builds on existing thresholding methods.

The paper tackles the problem of constructing confidence intervals for high-dimensional linear regression using thresholding estimators, showing that these intervals are always larger than standard least-squares intervals in finite samples and can be asymptotically larger by an order of magnitude when tuned for consistent variable selection.

We study confidence intervals based on hard-thresholding, soft-thresholding, and adaptive soft-thresholding in a linear regression model where the number of regressors $k$ may depend on and diverge with sample size $n$. In addition to the case of known error variance, we define and study versions of the estimators when the error variance is unknown. In the known variance case, we provide an exact analysis of the coverage properties of such intervals in finite samples. We show that these intervals are always larger than the standard interval based on the least-squares estimator. Asymptotically, the intervals based on the thresholding estimators are larger even by an order of magnitude when the estimators are tuned to perform consistent variable selection. For the unknown-variance case, we provide non-trivial lower bounds for the coverage probabilities in finite samples and conduct an asymptotic analysis where the results from the known-variance case can be shown to carry over asymptotically if the number of degrees of freedom $n-k$ tends to infinity fast enough in relation to the thresholding parameter.
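To make the three estimators named in the abstract concrete, here is a minimal sketch of the textbook hard-, soft-, and adaptive soft-thresholding rules applied to a single coefficient estimate. The function names, the argument `y` (standing in for a least-squares coefficient), and the threshold `eta` are illustrative assumptions. The paper's exact definitions, including any scaling by the error variance or by the regressors, are not reproduced here.

```python
def hard_threshold(y: float, eta: float) -> float:
    """Keep y unchanged if it exceeds the threshold in absolute value, else 0."""
    return y if abs(y) > eta else 0.0


def soft_threshold(y: float, eta: float) -> float:
    """Shrink y toward zero by eta; values within the threshold become 0."""
    if abs(y) <= eta:
        return 0.0
    return y - eta if y > 0 else y + eta


def adaptive_soft_threshold(y: float, eta: float) -> float:
    """Shrink y by the factor (1 - eta^2 / y^2), truncated at 0.

    Large |y| is shrunk only slightly; |y| <= eta is set to 0.
    """
    if abs(y) <= eta:
        return 0.0
    return y * (1.0 - (eta / y) ** 2)


if __name__ == "__main__":
    eta = 1.0  # hypothetical threshold, chosen only for illustration
    for y in (-3.0, -0.5, 0.5, 2.0):
        print(y, hard_threshold(y, eta), soft_threshold(y, eta),
              adaptive_soft_threshold(y, eta))
```

All three rules set small estimates exactly to zero, which is what allows variable selection. They differ in how much they shrink the estimates that survive the threshold.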

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
