CV · Oct 1, 2021

Visual Cluster Separation Using High-Dimensional Sharpened Dimensionality Reduction

arXiv:2110.00317v2 · 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses a challenge that end-users face in exploratory data analysis: distinguishing clusters in unlabeled datasets. The advance is incremental, since it builds on existing dimensionality reduction (DR) methods.

The paper tackles the problem of poor cluster separation in 2D projections from dimensionality reduction (DR) by sharpening clusters in high-dimensional data before DR, resulting in better visual cluster separation and good quality metrics on synthetic and real-world datasets.

Applying dimensionality reduction (DR) to large, high-dimensional data sets can be challenging when distinguishing the underlying high-dimensional data clusters in a 2D projection for exploratory analysis. We address this problem by first sharpening the clusters in the original high-dimensional data prior to the DR step using Local Gradient Clustering (LGC). We then project the sharpened data from the high-dimensional space to 2D by a user-selected DR method. The sharpening step aids this method to preserve cluster separation in the resulting 2D projection. With our method, end-users can label each distinct cluster to further analyze an otherwise unlabeled data set. Our 'High-Dimensional Sharpened DR' (HD-SDR) method, tested on both synthetic and real-world data sets, is favorable to DR methods with poor cluster separation and yields a better visual cluster separation than these DR methods with no sharpening. Our method achieves good quality (measured by quality metrics) and scales computationally well with large high-dimensional data. To illustrate its concrete applications, we further apply HD-SDR on a recent astronomical catalog.
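
The two-stage idea in the abstract (sharpen in the high-dimensional space, then project to 2D) can be illustrated with a minimal sketch. This is not the authors' implementation. The sharpening step below is a simple mean-shift-style stand-in that moves each point a fraction of the way toward the mean of its nearest neighbours, which only approximates a step along the local density gradient. The paper's actual LGC update may differ. PCA stands in for the "user-selected DR method". The function names (`sharpen`, `pca_2d`, `hd_sdr_sketch`) and the parameters `k`, `alpha`, and `iterations` are hypothetical and chosen for illustration only.

```python
import numpy as np


def sharpen(X, k=10, alpha=0.2, iterations=5):
    """Pull each point toward the mean of its k nearest neighbours.

    Assumption: a mean-shift-style stand-in for gradient-based sharpening,
    not the exact LGC update from the paper.
    """
    X = np.asarray(X, dtype=float).copy()  # leave the caller's data untouched
    for _ in range(iterations):
        # Brute-force pairwise squared distances (n x n); fine for small n only.
        sq = np.sum(X ** 2, axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
        # Indices of the k nearest neighbours (the point itself is included).
        idx = np.argsort(d2, axis=1)[:, : k + 1]
        local_mean = X[idx].mean(axis=1)
        # Move a fraction alpha of the way toward the local mean.
        X += alpha * (local_mean - X)
    return X


def pca_2d(X):
    """Project to 2D with PCA (stand-in for any user-selected DR method)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T


def hd_sdr_sketch(X, k=10, alpha=0.2, iterations=5):
    """Sharpen in the original space first, then reduce to 2D."""
    return pca_2d(sharpen(X, k=k, alpha=alpha, iterations=iterations))
```

The brute-force neighbour search here is quadratic in the number of points, so it suits small demonstrations only. Scaling to large data sets, as the abstract describes, would need an approximate or indexed nearest-neighbour search.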
