Generative Principal Component Analysis
This work addresses PCA problems such as spiked matrix recovery and phase retrieval, offering a new method with theoretical guarantees and experimental gains, though the contribution is incremental in that it combines generative models with existing PCA techniques.
The paper tackles principal component analysis under generative modeling assumptions, showing that the proposed quadratic estimator achieves a statistical rate of order $\sqrt{\frac{k\log L}{m}}$ and providing a variant of the power method that converges exponentially fast to a point achieving this rate.
In this paper, we study the problem of principal component analysis with generative modeling assumptions, adopting a general model for the observed matrix that encompasses notable special cases, including spiked matrix recovery and phase retrieval. The key assumption is that the underlying signal lies near the range of an $L$-Lipschitz continuous generative model with bounded $k$-dimensional inputs. We propose a quadratic estimator, and show that it enjoys a statistical rate of order $\sqrt{\frac{k\log L}{m}}$, where $m$ is the number of samples. We also provide a near-matching algorithm-independent lower bound. Moreover, we provide a variant of the classic power method, which projects the calculated data onto the range of the generative model during each iteration. We show that under suitable conditions, this method converges exponentially fast to a point achieving the above-mentioned statistical rate. We perform experiments on various image datasets for spiked matrix and phase retrieval models, and illustrate the performance gains of our method over the classic power method and the truncated power method devised for sparse principal component analysis.
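
The projected power iteration described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the generative model is replaced by a linear map whose range is the column space of a matrix `Q` (so the projection is exact and cheap), the observed matrix `V` is a rank-one spike plus symmetric Gaussian noise, and the names `project_onto_range`, `projected_power_method`, and `demo` are hypothetical. With a neural generative model, the projection step would typically be only approximate.

```python
import numpy as np


def project_onto_range(w, Q):
    """Project w onto the column space of Q (orthonormal columns), then normalize.

    Assumption: the generative model is linear, so its range is a subspace.
    """
    p = Q @ (Q.T @ w)
    return p / np.linalg.norm(p)


def projected_power_method(V, Q, w0, num_iters=50):
    """Power iteration with a projection onto the model's range at each step."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(num_iters):
        w = project_onto_range(V @ w, Q)
    return w


def demo(n=200, k=5, noise=0.02, seed=0):
    """Recover a unit-norm signal lying in a k-dimensional range from a noisy spike."""
    rng = np.random.default_rng(seed)

    # Orthonormal basis for the range of the (assumed linear) generative model.
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))

    # Ground-truth signal in the range, normalized to unit length.
    x = Q @ rng.standard_normal(k)
    x /= np.linalg.norm(x)

    # Observed matrix: rank-one spike plus symmetric noise (illustrative model).
    E = rng.standard_normal((n, n))
    V = np.outer(x, x) + noise * (E + E.T) / 2

    w = projected_power_method(V, Q, rng.standard_normal(n))

    # Absolute cosine similarity, since the sign of x is not identifiable.
    return abs(w @ x)


if __name__ == "__main__":
    print(f"|cosine similarity| with the true signal: {demo():.4f}")
```

Because each iterate is forced back into the $k$-dimensional range, only the noise restricted to that range affects the estimate, which is the intuition behind a rate that depends on $k$ rather than on the ambient dimension.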