STCV Jan 31, 2018

An Infinitesimal Probabilistic Model for Principal Component Analysis of Manifold Valued Data

arXiv:1801.10341v2
1 citation
Originality: Highly original
AI Analysis

This work addresses the challenge of statistical analysis for data on manifolds. It is incremental in that it builds on probabilistic PCA, but it introduces novel geometric and stochastic elements.

The authors generalize principal component analysis (PCA) to nonlinear manifold-valued data by developing a probabilistic, infinitesimal model that avoids linearization. The model uses stochastic development and principal fiber bundles to handle global transport and curvature effects. The authors provide estimation procedures and demonstrate the model's properties on embedded surfaces.

We provide a probabilistic and infinitesimal view of how the principal component analysis procedure (PCA) can be generalized to analysis of nonlinear manifold valued data. Starting with the probabilistic PCA interpretation of the Euclidean PCA procedure, we show how PCA can be generalized to manifolds in an intrinsic way that does not resort to linearization of the data space. The underlying probability model is constructed by mapping a Euclidean stochastic process to the manifold using stochastic development of Euclidean semimartingales. The construction uses a connection and bundles of covariant tensors to allow global transport of principal eigenvectors, and the model is thereby an example of how principal fiber bundles can be used to handle the lack of global coordinate system and orientations that characterizes manifold valued statistics. We show how curvature implies non-integrability of the equivalent of Euclidean principal subspaces, and how the stochastic flows provide an alternative to explicit construction of such subspaces. We describe estimation procedures for inference of parameters and prediction of principal components, and we give examples of properties of the model on embedded surfaces.
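To make the idea of stochastic development concrete, the following is a minimal sketch, not the authors' implementation. It assumes the simplest embedded surface, the unit sphere in R^3, and approximates development by repeated small geodesic steps with parallel transport of a tangent frame. The function names (`develop_step`, `develop_path`) and the diagonal `scales` parameter, which stands in for an anisotropic covariance of the Euclidean process, are hypothetical and chosen only for illustration.

```python
import math
import random


def dot(a, b):
    """Euclidean inner product of two 3-vectors stored as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def combine(alpha, a, beta, b):
    """Return alpha*a + beta*b for 3-vectors stored as lists."""
    return [alpha * ai + beta * bi for ai, bi in zip(a, b)]


def develop_step(x, frame, increment):
    """Roll one Euclidean increment onto the unit sphere.

    x:         unit 3-vector (current point on the sphere)
    frame:     two orthonormal tangent vectors at x
    increment: coordinates (da, db) of the Euclidean step in that frame
    Returns the new point and the parallel-transported frame.
    """
    e1, e2 = frame
    v = combine(increment[0], e1, increment[1], e2)  # tangent vector at x
    t = math.sqrt(dot(v, v))                         # geodesic step length
    if t == 0.0:
        return x, frame
    u = [vi / t for vi in v]                         # unit direction of travel
    # Follow the great circle through x in direction u for arc length t.
    x_new = combine(math.cos(t), x, math.sin(t), u)
    # The direction of travel itself, transported to x_new.
    u_new = combine(math.cos(t), u, -math.sin(t), x)
    new_frame = []
    for e in frame:
        c = dot(e, u)                 # component along the geodesic
        perp = combine(1.0, e, -c, u)  # component orthogonal to it (unchanged)
        new_frame.append(combine(c, u_new, 1.0, perp))
    return x_new, new_frame


def develop_path(scales, n_steps, dt, seed=0):
    """Develop an anisotropic Euclidean random walk onto the sphere.

    scales: per-axis standard deviations (assumed diagonal covariance)
    """
    rng = random.Random(seed)
    x = [0.0, 0.0, 1.0]                         # start at the north pole
    frame = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # initial tangent frame
    path = [x]
    for _ in range(n_steps):
        inc = [s * math.sqrt(dt) * rng.gauss(0.0, 1.0) for s in scales]
        x, frame = develop_step(x, frame, inc)
        path.append(x)
    return path, frame


if __name__ == "__main__":
    path, frame = develop_path(scales=(1.0, 0.2), n_steps=200, dt=0.01, seed=1)
    print("end point:", path[-1])
```

Because the frame is carried along by parallel transport, the larger entry of `scales` always refers to the same transported direction, even though no global coordinate system exists on the sphere. This is the role the abstract assigns to the connection and frame bundle; the sketch covers only the forward simulation, not the estimation procedures.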

