LG AI OC ML

Mar 1, 2025

Hidden Convexity of Fair PCA and Fast Solver via Eigenvalue Optimization

Junhui Shen, Aaron J. Davis, Ding Lu, Zhaojun Bai

arXiv:2503.00299v1

3 citations

h-index: 7

Originality: Incremental advance

AI Analysis

This work addresses bias in PCA on high-dimensional datasets by providing a faster solver for fair dimensionality reduction. The advance is incremental, since it builds on existing FPCA models.

The paper tackles the computational inefficiency of Fair PCA (FPCA) by identifying hidden convexity in the model and introducing an algorithm based on eigenvalue optimization, achieving an 8x speedup over the SDR-based method while maintaining fairness in reconstruction loss.

Principal Component Analysis (PCA) is a foundational technique in machine learning for dimensionality reduction of high-dimensional datasets. However, PCA can lead to biased outcomes that disadvantage certain subgroups of the underlying datasets. To address this bias, Samadi et al. (2018) introduced a Fair PCA (FPCA) model that equalizes the reconstruction loss between subgroups. The semidefinite relaxation (SDR) based approach proposed by Samadi et al. (2018) is computationally expensive even for suboptimal solutions. To improve efficiency, several alternative variants of the FPCA model have been developed, but these variants often shift the focus away from equalizing the reconstruction loss. In this paper, we identify a hidden convexity in the FPCA model and introduce an algorithm for convex optimization via eigenvalue optimization. Our approach achieves the desired fairness in reconstruction loss without sacrificing performance. As demonstrated on real-world datasets, the proposed FPCA algorithm runs $8\times$ faster than the SDR-based algorithm, and is at most 85% slower than standard PCA.
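To make the bias described in the abstract concrete, the following minimal sketch measures how unevenly standard PCA can serve two subgroups. It is an illustration only, not the paper's eigenvalue-optimization solver. Several things here are assumptions: the synthetic groups `A` and `B`, the helper names, the data being already centered, and the choice to measure reconstruction loss as the excess error over each group's own best rank-d projection.

```python
import numpy as np

def pca_basis(X, d):
    """Top-d right singular vectors of X as columns (X assumed centered)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:d].T

def avg_reconstruction_error(X, U):
    """Mean squared error per row when projecting rows of X onto span(U)."""
    residual = X - X @ U @ U.T
    return np.sum(residual ** 2) / X.shape[0]

def avg_reconstruction_loss(X, U, d):
    """Excess error of U over the group's own best rank-d projection
    (an assumed definition of reconstruction loss for this sketch)."""
    return avg_reconstruction_error(X, U) - avg_reconstruction_error(X, pca_basis(X, d))

rng = np.random.default_rng(0)
# Hypothetical subgroups: a large group varying mostly along the first
# coordinate and a small group varying mostly along the last one.
A = rng.normal(size=(200, 5)) * np.array([3.0, 1.0, 0.3, 0.3, 0.3])
B = rng.normal(size=(50, 5)) * np.array([0.3, 0.3, 0.3, 1.0, 3.0])

d = 1
U = pca_basis(np.vstack([A, B]), d)  # standard PCA on the pooled data

loss_a = avg_reconstruction_loss(A, U, d)
loss_b = avg_reconstruction_loss(B, U, d)
print(f"loss for A: {loss_a:.3f}, loss for B: {loss_b:.3f}")
```

In this setup the pooled PCA direction is dominated by the larger group, so the smaller group incurs a much higher loss. A fair formulation would instead choose the projection so that the two losses are balanced, which is the goal the abstract attributes to FPCA.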
