ME ML Jun 25, 2021

Feature Grouping and Sparse Principal Component Analysis with Truncated Regularization

Haiyan Jiang, Shanshan Qin, Oscar Hernan Madrid Padilla

arXiv:2106.13685v21.2 h-index: 30 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more interpretable and efficient PCA methods in data analysis, though it is incremental as it builds on existing structured PCA approaches.

The paper tackles the problem of capturing grouping and sparse structures in the factor loadings of principal component analysis (PCA). It proposes the FGSPCA method, which uses a truncated regularization to reduce model complexity and improve interpretability without requiring prior knowledge. Experimental results show promising performance and efficiency on synthetic and real-world datasets.

In this paper, we consider a new variant of principal component analysis (PCA), aiming to capture the grouping and/or sparse structures of factor loadings simultaneously. To achieve these goals, we employ a non-convex truncated regularization with naturally adjustable sparsity and grouping effects, and propose Feature Grouping and Sparse Principal Component Analysis (FGSPCA). The proposed FGSPCA method encourages factor loadings with similar values to collapse into disjoint homogeneous groups for feature grouping, or into a special zero-valued group for feature selection, which in turn helps reduce model complexity and increase model interpretability. Existing structured PCA methods usually require prior knowledge to construct the regularization term. In contrast, the proposed FGSPCA can simultaneously capture the grouping and/or sparse structures of factor loadings without any prior information. To solve the resulting non-convex optimization problem, we propose an alternating algorithm that incorporates difference-of-convex programming, the augmented Lagrangian method, and the coordinate descent method. Experimental results demonstrate the promising performance and efficiency of the new method on both synthetic and real-world datasets. An R implementation of FGSPCA can be found on GitHub (https://github.com/higeeks/FGSPCA).
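The abstract does not state the penalty in closed form, so the following is only a minimal sketch of how a truncated regularizer could produce both effects. It assumes the common truncated L1 form min(|x| / tau, 1), applied to each loading (sparsity) and to every pairwise difference of loadings (grouping). The function names, the all-pairs grouping, and the parameters `tau`, `lam_sparse`, and `lam_group` are hypothetical illustrations and are not taken from the paper or its R package.

```python
from itertools import combinations


def truncated_l1(x, tau):
    """Assumed truncated L1 penalty: grows linearly up to tau, then stays at 1."""
    return min(abs(x) / tau, 1.0)


def grouping_sparsity_penalty(loadings, tau, lam_sparse, lam_group):
    """Hypothetical combined penalty on one loading vector.

    The sparsity term penalizes loadings that are small but nonzero.
    The grouping term penalizes pairs of loadings that are close but unequal.
    """
    sparse_term = sum(truncated_l1(b, tau) for b in loadings)
    group_term = sum(
        truncated_l1(a - b, tau) for a, b in combinations(loadings, 2)
    )
    return lam_sparse * sparse_term + lam_group * group_term


# A vector whose entries have already collapsed into the groups {1.0, 1.0} and {0.0}
grouped = grouping_sparsity_penalty([1.0, 1.0, 0.0], tau=0.5, lam_sparse=1.0, lam_group=1.0)
# A nearby vector whose entries are close to, but not exactly in, those groups
scattered = grouping_sparsity_penalty([1.0, 0.9, 0.1], tau=0.5, lam_sparse=1.0, lam_group=1.0)
```

Under these assumptions the collapsed vector receives a lower penalty than the nearby scattered one, while differences larger than `tau` cost no more than the cap. This is the sense in which such a penalty pulls similar loadings together without shrinking well-separated ones.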

