ML · Jun 26, 2015

Principal Geodesic Analysis for Probability Measures under the Optimal Transport Metric

arXiv:1506.07944v2 · 76 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of dimensionality reduction for probability measures in machine learning and statistics, offering a novel geometric approach with potential applications in image analysis, though it is incremental in extending PCA concepts to optimal transport.

The paper tackles the problem of summarizing families of probability measures by finding principal geodesic curves under the optimal transport metric, adapting concepts from Euclidean PCA to this geometry. It proposes scalable algorithms using relaxed geodesics and regularized distances, demonstrating results on image datasets as shapes or color histograms.

Given a family of probability measures in P(X), the space of probability measures on a Hilbert space X, our goal in this paper is to highlight one or more curves in P(X) that efficiently summarize that family. We propose to study this problem under the optimal transport (Wasserstein) geometry, using curves that are restricted to be geodesic segments under that metric. We show that concepts that play a key role in Euclidean PCA, such as data centering or orthogonality of principal directions, find a natural equivalent in the optimal transport geometry, using Wasserstein means and differential geometry. The implementation of these ideas is, however, computationally challenging. To achieve scalable algorithms that can handle thousands of measures, we propose to use a relaxed definition for geodesics and regularized optimal transport distances. The interest of our approach is demonstrated on images seen either as shapes or color histograms.
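
To make "regularized optimal transport distances" concrete, here is a minimal sketch of an entropy-regularized transport cost between two histograms, computed with Sinkhorn iterations. It is an illustration under stated assumptions, not the paper's implementation: the function name `sinkhorn_cost`, the regularization strength `reg`, the iteration count, and the toy histograms are all hypothetical choices made for this example.

```python
import numpy as np

def sinkhorn_cost(a, b, C, reg=0.1, n_iter=500):
    """Entropy-regularized transport cost between histograms a and b.

    a, b : nonnegative weight vectors that each sum to 1
    C    : pairwise ground-cost matrix, shape (len(a), len(b))
    reg  : regularization strength (assumed value, chosen for illustration)
    """
    K = np.exp(-C / reg)              # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)             # rescale to match column marginals
        u = a / (K @ v)               # rescale to match row marginals
    P = u[:, None] * K * v[None, :]   # regularized transport plan
    return float(np.sum(P * C)), P

# Toy example: two histograms on a 1-D grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
a = np.array([0.5, 0.5, 0.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 0.0, 0.5, 0.5])
cost, P = sinkhorn_cost(a, b, C)
```

The returned plan `P` has `a` and `b` as its row and column marginals (up to convergence tolerance). Smaller values of `reg` bring the cost closer to the unregularized optimal transport cost, at the price of slower convergence and possible numerical underflow in the kernel.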

