Optimal Structured Principal Subspace Estimation: Metric Entropy and Minimax Rates
This work develops a unified statistical framework for structured principal subspace estimation. The contribution is incremental in that it generalizes and refines existing results, with applications in machine learning and data analysis.
The paper tackles the problem of estimating principal subspaces under various structural constraints, establishing general minimax bounds that reveal phase transitions and providing optimal rates for specific cases like non-negative PCA/SVD.
Driven by a wide range of applications, many principal subspace estimation problems have been studied individually under different structural constraints. This paper presents a unified framework for the statistical analysis of a general structured principal subspace estimation problem which includes, as special cases, non-negative PCA/SVD, sparse PCA/SVD, subspace constrained PCA/SVD, and spectral clustering. General minimax lower and upper bounds are established to characterize the interplay between the information-geometric complexity of the structural set for the principal subspaces, the signal-to-noise ratio (SNR), and the dimensionality. The results yield interesting phase transition phenomena concerning the rates of convergence as a function of the SNRs and the fundamental limit for consistent estimation. Applying the general results to the specific settings yields the minimax rates of convergence for those problems, including the previously unknown optimal rates for non-negative PCA/SVD, sparse SVD, and subspace constrained PCA/SVD.
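To make the setting concrete, the following is a minimal simulation sketch of one special case named above, non-negative SVD in a rank-one signal-plus-noise matrix model. It is not the estimator analyzed in the paper: the model form, the function names `simulate` and `sin_theta_loss`, the parameters `snr`, `p`, `n`, and the naive "SVD, then clip to the non-negative orthant" estimator are all assumptions chosen for illustration. The loss used is the sine of the angle between the estimated and true directions, a common way to measure subspace estimation error.

```python
import numpy as np

def sin_theta_loss(u_hat, u):
    """Sine of the angle between two unit vectors (invariant to sign flips)."""
    c = min(1.0, abs(float(u_hat @ u)))
    return float(np.sqrt(max(0.0, 1.0 - c ** 2)))

def simulate(snr, p=200, n=100, seed=0):
    """Draw Y = snr * u v^T + Z with non-negative unit u (assumed model) and
    return the sin-theta loss of (plain SVD, clipped SVD) estimates of u."""
    rng = np.random.default_rng(seed)
    u = np.abs(rng.standard_normal(p))
    u /= np.linalg.norm(u)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    Y = snr * np.outer(u, v) + rng.standard_normal((p, n))

    # Unstructured estimate: leading left singular vector of Y.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    u_svd = U[:, 0]
    # Resolve the sign ambiguity so that most of the mass is positive.
    if u_svd.sum() < 0:
        u_svd = -u_svd

    # Naive structured estimate (illustrative only): clip negatives, renormalize.
    u_clip = np.clip(u_svd, 0.0, None)
    norm = np.linalg.norm(u_clip)
    u_nn = u_clip / norm if norm > 0 else u_svd

    return sin_theta_loss(u_svd, u), sin_theta_loss(u_nn, u)

if __name__ == "__main__":
    for snr in (5.0, 20.0, 100.0):
        loss_svd, loss_nn = simulate(snr)
        print(f"snr={snr:6.1f}  svd={loss_svd:.3f}  clipped={loss_nn:.3f}")
```

Running the script for several values of `snr` gives a rough feel for how the estimation error depends on the signal strength. It does not reproduce the minimax rates or phase transitions established in the paper.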