LG, ML | Oct 7, 2021

AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow

Haiyan Jiang, Haoyi Xiong, Dongrui Wu, Ji Liu, Dejing Dou

arXiv:2110.03273v1 1.6

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck for researchers and practitioners using penalized PCA in high-dimensional settings, though it is incremental as it builds on prior methods.

The paper tackles the cost of model selection in penalized PCA for high-dimensional data. It proposes AgFlow, which uses the implicit regularization of gradient flow to lower the computational complexity of obtaining the solution path, and reports lower computation costs than existing methods.

Principal component analysis (PCA) is widely used as an effective technique for feature extraction and dimension reduction. In the High Dimension Low Sample Size (HDLSS) setting, one may prefer modified principal components with penalized loadings, with the penalty selected automatically by model selection among models with varying penalties. Earlier work [1, 2] proposed penalized PCA and indicated that model selection in $L_2$-penalized PCA is feasible through the solution path of Ridge regression; however, this approach is extremely time-consuming because it requires intensive computation of matrix inverses. In this paper, we propose a fast model selection method for penalized PCA, named Approximated Gradient Flow (AgFlow), which lowers the computational complexity by incorporating the implicit regularization effect introduced by (stochastic) gradient flow [3, 4] and obtains the complete solution path of $L_2$-penalized PCA under varying $L_2$-regularization. We perform extensive experiments on real-world datasets. AgFlow outperforms existing methods (Oja [5], Power [6], Shamir [7], and the vanilla Ridge estimators) in terms of computation costs.
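The link between gradient flow and an $L_2$ penalty that the abstract relies on can be illustrated on plain least squares. The sketch below is not the AgFlow algorithm. It only shows the general idea, under the common heuristic that gradient descent started at zero and stopped at time $t$ behaves roughly like a Ridge estimate with penalty $\lambda \approx 1/t$. The function names `ridge_path` and `gradient_descent_path`, the synthetic data, and the step size are all assumptions made for illustration.

```python
import numpy as np


def ridge_path(X, y, lambdas):
    """Closed-form Ridge estimates: one d-by-d linear solve per penalty value."""
    n, d = X.shape
    gram = X.T @ X / n
    rhs = X.T @ y / n
    return np.stack(
        [np.linalg.solve(gram + lam * np.eye(d), rhs) for lam in lambdas]
    )


def gradient_descent_path(X, y, step, n_steps):
    """Iterates of gradient descent on the unpenalized least-squares loss,
    started at zero. Each iterate costs only matrix-vector products."""
    n, d = X.shape
    beta = np.zeros(d)
    path = []
    for _ in range(n_steps):
        grad = X.T @ (X @ beta - y) / n
        beta = beta - step * grad
        path.append(beta.copy())
    return np.stack(path)


# Synthetic HDLSS-style problem: more features than samples (illustrative only).
rng = np.random.default_rng(0)
n, d = 20, 50
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

step, n_steps = 0.01, 200
gd = gradient_descent_path(X, y, step, n_steps)

# Heuristic pairing: iterate k (elapsed time t = k * step) vs Ridge with lambda = 1 / t.
times = step * np.arange(1, n_steps + 1)
ridge = ridge_path(X, y, 1.0 / times)

# Relative distance between the two paths at each paired point.
rel_gap = np.linalg.norm(gd - ridge, axis=1) / np.linalg.norm(ridge, axis=1)
print("largest relative gap along the path:", rel_gap.max())
```

A single run of gradient descent yields a whole sequence of estimates, one per iteration, while the Ridge path needs a separate linear solve for every penalty value. The two paths track each other only approximately in this sketch, so it should be read as intuition for the cost difference rather than as a reproduction of the paper's method.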
