LGAug 13, 2024

Optimal Bound for PCA with Outliers using Higher-Degree Voronoi Diagrams

Sajjad Hashemian, Mohammad Saeed Arvenaghi, Ebrahim Ardeshir-Larijani

arXiv:2408.06867v32.61 citationsh-index: 1

Originality Incremental advance

AI Analysis

This work addresses the challenge of robust PCA in the presence of outliers, which is incremental as it builds on existing PCA methods with computational geometry techniques.

The paper tackles the problem of Principal Component Analysis (PCA) with outliers by introducing new algorithms that achieve an optimal solution using higher-degree Voronoi diagrams and Grassmannian-based sampling, with time complexities of $n^{d+\mathcal{O}(1)}\text{poly}(n,d)$ and $2^{\mathcal{O}(r(d-r))} \times \text{poly}(n, d)$.

In this paper, we introduce new algorithms for Principal Component Analysis (PCA) with outliers. Utilizing techniques from computational geometry, specifically higher-degree Voronoi diagrams, we navigate to the optimal subspace for PCA even in the presence of outliers. This approach achieves an optimal solution with a time complexity of $n^{d+\mathcal{O}(1)}\text{poly}(n,d)$. Additionally, we present a randomized algorithm with a complexity of $2^{\mathcal{O}(r(d-r))} \times \text{poly}(n, d)$. This algorithm samples subspaces characterized in terms of a Grassmannian manifold. By employing such sampling method, we ensure a high likelihood of capturing the optimal subspace, with the success probability $(1 - δ)^T$. Where $δ$ represents the probability that a sampled subspace does not contain the optimal solution, and $T$ is the number of subspaces sampled, proportional to $2^{r(d-r)}$. Our use of higher-degree Voronoi diagrams and Grassmannian based sampling offers a clearer conceptual pathway and practical advantages, particularly in handling large datasets or higher-dimensional settings.

View on arXiv PDF

Similar