LG, HC, ML | May 3, 2019

Uncertainty-Aware Principal Component Analysis

arXiv:1905.01127v4 | 37 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for linear dimensionality reduction techniques that account for data uncertainty. The advance is incremental, since it builds on traditional PCA.

The paper tackles the problem of dimensionality reduction for uncertain data by generalizing PCA to handle multivariate probability distributions, resulting in a method that maintains distribution characteristics after projection and enables sensitivity analysis through novel visualizations.

We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the PCA sample covariance matrix that respects potential uncertainty in each of the inputs, building the mathematical foundation of our new method: uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables a better understanding of the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets. As a special case, we show how to propagate multivariate normal distributions through PCA in closed form. Furthermore, we discuss extensions and limitations of our approach.
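To make the idea concrete, here is a minimal sketch of one way to read the abstract, not the authors' implementation. It assumes each input is summarized by a mean vector and a covariance matrix, and it forms the covariance of the equal-weight mixture of those inputs using the law of total covariance (covariance of the means plus the average per-input covariance). Whether this matches the paper's derivation term by term is an assumption. The function names `uncertainty_aware_pca` and `project_gaussian` are hypothetical. The projection step relies only on the standard fact that a linear map W^T sends N(mu, Sigma) to N(W^T mu, W^T Sigma W).

```python
import numpy as np

def uncertainty_aware_pca(means, covs, n_components=2):
    """PCA on inputs given as (mean, covariance) pairs.

    Assumption: the covariance used is that of the equal-weight mixture
    of the inputs, obtained from the law of total covariance.
    """
    means = np.asarray(means, dtype=float)  # shape (N, d)
    covs = np.asarray(covs, dtype=float)    # shape (N, d, d)
    center = means.mean(axis=0)
    centered = means - center
    # Covariance of the means plus the average per-input covariance.
    total_cov = centered.T @ centered / len(means) + covs.mean(axis=0)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(total_cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return center, eigvecs[:, order], eigvals[order]

def project_gaussian(mean, cov, center, components):
    """Push N(mean, cov) through the linear projection in closed form."""
    proj_mean = components.T @ (np.asarray(mean, dtype=float) - center)
    proj_cov = components.T @ np.asarray(cov, dtype=float) @ components
    return proj_mean, proj_cov

# Three inputs whose means vary along x but whose uncertainty lies along y.
means = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]])
covs = np.array([np.diag([0.0, 10.0])] * 3)
center, components, variances = uncertainty_aware_pca(means, covs, n_components=1)
proj_mean, proj_cov = project_gaussian(means[1], covs[1], center, components)
```

In this toy example, PCA on the means alone would pick the x axis, while the uncertainty-aware variant picks the y axis because the per-input variance there dominates. With all covariances set to zero, the sketch reduces to ordinary PCA on the means (with a 1/N normalization).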

Code Implementations: 1 repo