Risk of the Least Squares Minimum Norm Estimator under the Spike Covariance Model
The paper provides theoretical guarantees for high-dimensional linear regression with structured covariance, an incremental but important contribution to statistical theory.
The paper analyzes the risk of the minimum norm least squares estimator when the number of parameters grows faster than the sample size, assuming a spike covariance model with low-rank structure. It shows that the estimator's risk vanishes relative to the risk of the null estimator and provides asymptotic and non-asymptotic upper bounds that are tighter than those in previous works.
We study the risk of the minimum norm linear least squares estimator when the number of parameters $d$ depends on $n$ and $\frac{d}{n} \rightarrow \infty$. We assume that the data has an underlying low-rank structure by restricting ourselves to spike covariance matrices, in which a fixed, finite number of eigenvalues grow with $n$ and are much larger than the remaining eigenvalues, which are (asymptotically) of the same order. We show that in this setting the risk of the minimum norm least squares estimator vanishes in comparison to the risk of the null estimator. We give asymptotic and non-asymptotic upper bounds for this risk, and we also leverage the spike model assumption to give an analysis of the bias that leads to tighter bounds than those in previous works.
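The setting described above can be illustrated with a small simulation. This is a minimal sketch, not the paper's experiment: the diagonal covariance, the choice of spike eigenvalues equal to $d$, the unit tail eigenvalues, the signal placed in the spike directions, and all function names (`spike_eigenvalues`, `min_norm_least_squares`, `excess_risk`, `simulate`) are assumptions made for illustration. It computes the minimum norm interpolator through the pseudoinverse and compares its excess risk with that of the null estimator $\hat\beta = 0$.

```python
import numpy as np


def spike_eigenvalues(d, k=2):
    """Eigenvalues of a hypothetical spike covariance: k large spikes, unit tail.

    Assumption for illustration: spikes equal d, so they grow with the problem
    size and dominate the remaining eigenvalues, which are all equal to 1.
    """
    eigs = np.ones(d)
    eigs[:k] = float(d)
    return eigs


def min_norm_least_squares(X, y):
    """Minimum l2-norm solution of X @ beta = y, via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y


def excess_risk(beta_hat, beta, eigs):
    """Excess prediction risk (beta_hat - beta)^T Sigma (beta_hat - beta), Sigma = diag(eigs)."""
    diff = beta_hat - beta
    return float(np.sum(eigs * diff ** 2))


def simulate(n=50, d=2000, k=2, noise_std=0.1, seed=0):
    """Draw one dataset with d >> n and compare min-norm risk with null risk."""
    rng = np.random.default_rng(seed)
    eigs = spike_eigenvalues(d, k)

    # Rows of X are N(0, diag(eigs)): scale each column by sqrt(eigenvalue).
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)

    # Assumed signal: unit-norm coefficient vector supported on the spike directions.
    beta = np.zeros(d)
    beta[:k] = 1.0 / np.sqrt(k)

    y = X @ beta + noise_std * rng.standard_normal(n)
    beta_hat = min_norm_least_squares(X, y)

    return {
        "X": X,
        "y": y,
        "beta_hat": beta_hat,
        "risk_min_norm": excess_risk(beta_hat, beta, eigs),
        "risk_null": excess_risk(np.zeros(d), beta, eigs),
    }


if __name__ == "__main__":
    out = simulate()
    ratio = out["risk_min_norm"] / out["risk_null"]
    print(f"min-norm risk: {out['risk_min_norm']:.4f}")
    print(f"null risk:     {out['risk_null']:.4f}")
    print(f"ratio:         {ratio:.2e}")
```

Under these assumed parameters the ratio of the two risks should come out small, which is the qualitative behavior the abstract describes; a single random draw is of course no substitute for the paper's bounds.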