ML LG
Jun 23, 2023

Two derivations of Principal Component Analysis on datasets of distributions

arXiv:2306.13503v12.31 citations
h-index: 18

Originality: Synthesis-oriented

AI Analysis

The paper provides a theoretical extension of PCA to distributional data. The contribution is incremental: it adapts existing PCA derivations to a new data type.

The paper tackles the problem of performing Principal Component Analysis (PCA) on datasets of distributions rather than points, deriving a closed-form solution from both variance-maximization and reconstruction-error minimization perspectives.

In this brief note, we formulate Principal Component Analysis (PCA) over datasets consisting not of points but of distributions, each characterized by its location and covariance. Just as the usual PCA on points can be derived equivalently via a variance-maximization principle and via minimization of reconstruction error, we derive a closed-form solution for distributional PCA from both of these perspectives.
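The abstract does not state the closed-form solution, so the following is only a minimal sketch under an assumption: that the variance being maximized is that of an equal-weight mixture of the input distributions. By the law of total covariance, that mixture has covariance equal to the covariance of the locations plus the average of the per-distribution covariances, and the principal directions are the top eigenvectors of that matrix. The function name `distributional_pca` and the parameters `means`, `covs`, and `k` are hypothetical and do not come from the paper.

```python
import numpy as np


def distributional_pca(means, covs, k):
    """Sketch of PCA over distributions given by location and covariance.

    ASSUMPTION (not taken from the paper): the objective is the projected
    variance of an equal-weight mixture of the n input distributions.

    means: array of shape (n, d), one location per distribution
    covs:  array of shape (n, d, d), one covariance per distribution
    k:     number of principal directions to return
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    # Covariance of the locations around their overall mean.
    centered = means - means.mean(axis=0)
    between = centered.T @ centered / means.shape[0]

    # Average of the per-distribution covariances.
    within = covs.mean(axis=0)

    # Law of total covariance for the equal-weight mixture.
    total = between + within

    # eigh returns eigenvalues in ascending order; keep the k largest.
    eigvals, eigvecs = np.linalg.eigh(total)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order].T, eigvals[order]
```

Under this assumption, setting every covariance to zero reduces the computation to ordinary PCA on the locations, while large per-distribution covariances can change which direction comes first even when the locations are unchanged.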
