A Robust Optimization Approach to Sparse Principal Component Analysis
This work offers practitioners in high-dimensional data analysis a novel, tuning-free approach to sparse PCA, although its improvements over existing methods are not quantified.
AdvPCA introduces a robust optimization framework for sparse PCA that avoids explicit sparsity penalties, achieving interpretable low-dimensional representations with a data-adaptive parameterization. Experiments on synthetic and genomics data demonstrate its effectiveness.
While principal component analysis (PCA) is a fundamental tool for dimensionality reduction, its dense representations make it ill-suited for high-dimensional data. Existing methods address this by promoting sparsity through explicit $\ell_1$-penalties, but these penalties are difficult to tune because the task is unsupervised. In contrast, we propose Adversarial PCA (AdvPCA), which leverages robust optimization to achieve sparsity by optimizing the reconstruction objective against bounded, worst-case perturbations in the latent space. We show that this formulation admits a closed-form reduction, leading to a practical iterative algorithm that alternates between adversarial linear regression-style updates for the sparse encoder and orthogonal updates for the decoder. By theoretically characterizing the solution, we derive a data-adaptive parameterization that allows the algorithm to perform effectively out of the box. We validate these claims through numerical experiments on synthetic and real-world genomics data.
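To make the alternating structure concrete, the following is a minimal sketch of a generic encoder/decoder alternation for sparse PCA. It is not the AdvPCA algorithm: the abstract does not give the adversarial encoder update or the data-adaptive parameterization, so the encoder step here substitutes a standard $\ell_1$ proximal-gradient (soft-thresholding) update with a hand-chosen penalty `lam`. Only the orthogonal decoder step, solved as an orthogonal Procrustes problem via an SVD, is likely to resemble the "orthogonal updates" described above. All function and parameter names are hypothetical.

```python
import numpy as np


def soft_threshold(Z, t):
    # Proximal operator of t * ||.||_1, applied entrywise.
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)


def decoder_step(X, B):
    # Orthogonal Procrustes: minimize ||X - X B A^T||_F subject to A^T A = I.
    # The minimizer is U V^T, where X^T X B = U S V^T (reduced SVD).
    U, _, Vt = np.linalg.svd(X.T @ (X @ B), full_matrices=False)
    return U @ Vt


def encoder_step(X, A, B, lam, n_steps=50):
    # Stand-in for the adversarial update (assumption): proximal gradient on
    # 0.5 * ||X A - X B||_F^2 + lam * ||B||_1.
    G = X.T @ X
    L = np.linalg.eigvalsh(G)[-1]  # largest eigenvalue = Lipschitz constant
    for _ in range(n_steps):
        grad = G @ (B - A)
        B = soft_threshold(B - grad / L, lam / L)
    return B


def alternating_sparse_pca(X, k, lam, n_iters=30):
    # X: (n_samples, n_features); returns decoder A and encoder B, both (p, k).
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    A = Vt[:k].T  # initialize with the ordinary PCA loadings
    B = A.copy()
    for _ in range(n_iters):
        B = encoder_step(X, A, B, lam)
        A = decoder_step(X, B)
    return A, B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 20))
    X[:, :5] += 3.0 * rng.standard_normal((200, 1))  # shared factor on 5 features
    A, B = alternating_sparse_pca(X, k=2, lam=5.0)
    print("nonzero encoder entries:", int(np.count_nonzero(B)), "of", B.size)
```

With `lam=0` the sketch leaves the ordinary PCA loadings unchanged, and larger values of `lam` zero out more encoder entries. That dependence on a hand-tuned penalty is exactly what the abstract says AdvPCA is designed to avoid.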