Learning Topology-Preserving Data Representations
This work addresses the challenge of preserving global data structure during dimensionality reduction, with applications in data analysis and visualization. It represents an incremental improvement over existing methods.
The paper tackles the problem of learning data representations that preserve topological features such as clusters and loops. It proposes RTD-AE, an autoencoder trained to minimize the Representation Topology Divergence (RTD) between the high-dimensional data and its low-dimensional representation, so that the two remain topologically similar. In the reported results, RTD-AE outperforms state-of-the-art competitors on linear correlation, triplet distance ranking accuracy, and Wasserstein distance between persistence barcodes.
We propose a method for learning topology-preserving data representations (dimensionality reduction). The method aims to provide topological similarity between the data manifold and its latent representation by enforcing similarity in topological features (clusters, loops, 2D voids, etc.) and their localization. The core of the method is the minimization of the Representation Topology Divergence (RTD) between the original high-dimensional data and its low-dimensional representation in latent space. RTD minimization provides closeness in topological features with strong theoretical guarantees. We develop a scheme for RTD differentiation and apply it as a loss term for the autoencoder. The proposed method, RTD-AE, preserves the global structure and topology of the data manifold better than state-of-the-art competitors, as measured by linear correlation, triplet distance ranking accuracy, and Wasserstein distance between persistence barcodes.
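Two of the evaluation measures named above can be illustrated with a minimal sketch. It rests on assumptions not stated in the text: that "linear correlation" means the Pearson correlation between pairwise distances in the original and latent spaces, and that "triplet distance ranking accuracy" means the fraction of sampled point triplets whose three pairwise distances keep the same ordering in both spaces. The function names `distance_correlation` and `triplet_accuracy` are hypothetical, and this is not the authors' evaluation code.

```python
import itertools
import math
import random


def pairwise_distances(points):
    """Euclidean distances for all unordered pairs (i < j), in a fixed order."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


def distance_correlation(original, latent):
    """Pearson correlation between pairwise distances in the two spaces (assumed definition)."""
    dx = pairwise_distances(original)
    dz = pairwise_distances(latent)
    mx, mz = sum(dx) / len(dx), sum(dz) / len(dz)
    cov = sum((a - mx) * (b - mz) for a, b in zip(dx, dz))
    var_x = sum((a - mx) ** 2 for a in dx)
    var_z = sum((b - mz) ** 2 for b in dz)
    return cov / math.sqrt(var_x * var_z)


def _ranking(points, i, j, k):
    """Order of the three pairwise distances within the triplet (i, j, k)."""
    d = (
        math.dist(points[i], points[j]),
        math.dist(points[i], points[k]),
        math.dist(points[j], points[k]),
    )
    return sorted(range(3), key=lambda t: d[t])


def triplet_accuracy(original, latent, n_triplets=500, seed=0):
    """Fraction of sampled triplets whose distance ranking agrees in both spaces (assumed definition)."""
    rng = random.Random(seed)
    n = len(original)
    agree = 0
    for _ in range(n_triplets):
        i, j, k = rng.sample(range(n), 3)
        agree += _ranking(original, i, j, k) == _ranking(latent, i, j, k)
    return agree / n_triplets


# Toy data: a circle embedded in 3D, a faithful 2D projection, and a random 2D embedding.
rng = random.Random(42)
angles = [rng.uniform(0, 2 * math.pi) for _ in range(60)]
circle_3d = [(math.cos(a), math.sin(a), 0.0) for a in angles]
flat_2d = [(x, y) for x, y, _ in circle_3d]
scrambled = [(rng.random(), rng.random()) for _ in circle_3d]

print(distance_correlation(circle_3d, flat_2d), triplet_accuracy(circle_3d, flat_2d))
print(distance_correlation(circle_3d, scrambled), triplet_accuracy(circle_3d, scrambled))
```

In this toy setup the faithful projection keeps every pairwise distance, so both scores sit at their maximum, while the random embedding scores much lower. Neither measure inspects loops or voids directly, which is presumably why the Wasserstein distance between persistence barcodes is reported alongside them.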