Rethinking PCA Through Duality
This work offers incremental improvements in PCA methodology, potentially benefiting machine learning practitioners by enhancing optimization and robustness in dimensionality reduction.
The paper revisits principal component analysis (PCA) by connecting it to self-attention and the difference-of-convex framework, providing new theoretical insights and algorithms, including a kernelizable dual formulation for robust PCA with l1 deviation minimization, and empirically compares them with state-of-the-art methods.
Motivated by the recently shown connection between self-attention and (kernel) principal component analysis (PCA), we revisit the fundamentals of PCA. Using the difference-of-convex (DC) framework, we present several novel formulations and provide new theoretical insights. In particular, we show the kernelizability and out-of-sample applicability for a PCA-like family of problems. Moreover, we uncover that simultaneous iteration, which is connected to the classical QR algorithm, is an instance of the difference-of-convex algorithm (DCA), offering an optimization perspective on this longstanding method. Further, we describe new algorithms for PCA and empirically compare them with state-of-the-art methods. Lastly, we introduce a kernelizable dual formulation for a robust variant of PCA that minimizes the $l_1$ deviation of the reconstruction errors.