STITLGMLApr 7, 2021

Minimax Estimation of Linear Functions of Eigenvectors in the Face of Small Eigen-Gaps

arXiv:2104.03298v22 citations
Originality Highly original
AI Analysis

This addresses a critical limitation in eigenvector perturbation theory for data science applications where fine-grained eigenvector behavior is needed, offering improved statistical guarantees for tasks like matrix denoising and PCA.

The paper tackles the problem of estimating linear functions of eigenvectors when eigen-gaps are small, developing de-biased estimators for matrix denoising and PCA under Gaussian noise that achieve minimax optimality (modulo logarithmic factors) without sample splitting.

Eigenvector perturbation analysis plays a vital role in various data science applications. A large body of prior works, however, focused on establishing $\ell_{2}$ eigenvector perturbation bounds, which are often highly inadequate in addressing tasks that rely on fine-grained behavior of an eigenvector. This paper makes progress on this by studying the perturbation of linear functions of an unknown eigenvector. Focusing on two fundamental problems -- matrix denoising and principal component analysis -- in the presence of Gaussian noise, we develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector. In order to mitigate a non-negligible bias issue inherent to the natural ``plug-in'' estimator, we develop de-biased estimators that (1) achieve minimax lower bounds for a family of scenarios (modulo some logarithmic factor), and (2) can be computed in a data-driven manner without sample splitting. Noteworthily, the proposed estimators are nearly minimax optimal even when the associated eigen-gap is {\em substantially smaller} than what is required in prior statistical theory.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes