LGFeb 24, 2022

SQuadMDS: a lean Stochastic Quartet MDS improving global structure preservation in neighbor embedding like t-SNE and UMAP

arXiv:2202.12087v111 citationsHas Code
Originality Incremental advance
AI Analysis

This work addresses the scalability issue in data visualization for large datasets, offering an incremental improvement by enhancing global structure preservation in existing neighbor embedding methods.

The paper tackles the problem of high computational complexity in multidimensional scaling (MDS) algorithms by introducing SQuadMDS, a stochastic, force-directed method with O(N) time and space complexity, which can be combined with neighbor embedding techniques like t-SNE to preserve both global and local data structures, showing competitive results that outperform state-of-the-art approaches.

Multidimensional scaling is a statistical process that aims to embed high dimensional data into a lower-dimensional space; this process is often used for the purpose of data visualisation. Common multidimensional scaling algorithms tend to have high computational complexities, making them inapplicable on large data sets. This work introduces a stochastic, force directed approach to multidimensional scaling with a time and space complexity of O(N), with N data points. The method can be combined with force directed layouts of the family of neighbour embedding such as t-SNE, to produce embeddings that preserve both the global and the local structures of the data. Experiments assess the quality of the embeddings produced by the standalone version and its hybrid extension both quantitatively and qualitatively, showing competitive results outperforming state-of-the-art approaches. Codes are available at https://github.com/PierreLambert3/SQuaD-MDS-and-FItSNE-hybrid.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes