ML · LG · Mar 9, 2023

Entropic Wasserstein Component Analysis

arXiv:2303.05119v1 · 2 citations · h-index: 33
Originality: Incremental advance
AI Analysis

An incremental advance for machine learning researchers and practitioners who need dimension reduction that preserves cluster structure.

The paper tackles the problem of dimension reduction by combining optimal transport and PCA to preserve global dependencies and clusters in embeddings, resulting in more interpretable and effective embeddings.

Dimension reduction (DR) methods provide systematic approaches for analyzing high-dimensional data. A key requirement for DR is to incorporate global dependencies among original and embedded samples while preserving clusters in the embedding space. To achieve this, we combine the principles of optimal transport (OT) and principal component analysis (PCA). Our method seeks the best linear subspace that minimizes reconstruction error using entropic OT, which naturally encodes the neighborhood information of the samples. From an algorithmic standpoint, we propose an efficient block-majorization-minimization solver over the Stiefel manifold. Our experimental results demonstrate that our approach can effectively preserve high-dimensional clusters, leading to more interpretable and effective embeddings. Python code of the algorithms and experiments is available online.
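To make the idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the objective takes the form sum_ij P_ij * ||x_i - U U^T x_j||^2 plus an entropic penalty on the coupling P, with uniform marginals, and it swaps the paper's block-majorization-minimization solver for plain alternating minimization: a log-domain Sinkhorn update for P, then an eigendecomposition update for U (which stays on the Stiefel manifold because its columns are orthonormal). The names `sinkhorn_plan` and `ot_subspace_sketch`, and the parameters `eps`, `n_iter`, and `n_outer`, are hypothetical and chosen for illustration only.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_plan(C, eps, n_iter=200):
    """Entropic OT plan for cost matrix C with uniform marginals (log-domain Sinkhorn)."""
    n, m = C.shape
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):
        # Dual updates that enforce the row, then the column, marginals.
        f = eps * (log_a - _logsumexp((g[None, :] - C) / eps, axis=1))
        g = eps * (log_b - _logsumexp((f[:, None] - C) / eps, axis=0))
    return np.exp((f[:, None] + g[None, :] - C) / eps)


def ot_subspace_sketch(X, k, eps=1.0, n_outer=20, seed=0):
    """Alternate between an entropic coupling P and an orthonormal basis U (d x k).

    Illustrative only: assumes the cost C_ij = ||x_i - U U^T x_j||^2.
    """
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)  # center the data, as in PCA
    d = X.shape[1]
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    for _ in range(n_outer):
        Z = X @ U @ U.T  # projections of all samples onto span(U)
        C = (X**2).sum(axis=1)[:, None] - 2.0 * X @ Z.T + (Z**2).sum(axis=1)[None, :]
        P = sinkhorn_plan(C, eps)
        # For fixed P, the cost equals const - trace(U^T M U), so the minimizer
        # over orthonormal U is given by the top-k eigenvectors of M.
        b = P.sum(axis=0)
        M = X.T @ (P + P.T - np.diag(b)) @ X
        M = (M + M.T) / 2.0  # remove floating-point asymmetry
        _, V = np.linalg.eigh(M)  # eigenvalues in ascending order
        U = V[:, -k:]
    return U, P


# Usage on synthetic data with two well-separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
U, P = ot_subspace_sketch(X, k=2, eps=1.0, n_outer=5)
embedding = (X - X.mean(axis=0)) @ U  # 40 x 2 linear embedding
```

Under these assumptions, a very small `eps` concentrates the coupling near the diagonal, so the U update approaches ordinary PCA; a larger `eps` spreads mass across neighboring samples, which is how neighborhood information enters the subspace fit.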

Code Implementations: 1 repo