LG | Dec 19, 2024

Dimension Reduction with Locally Adjusted Graphs

Yingfan Wang, Yiyang Sun, Haiyang Huang, Cynthia Rudin

arXiv:2412.15426v213.413 citations | h-index: 6 | Has Code | AAAI

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in DR for biological data analysis, offering incremental improvements in cluster detection for real-world applications.

The paper tackles the problem of unreliable graphs in dimension reduction (DR) algorithms for high-dimensional datasets such as transcriptomic data. It introduces LocalMAP, a method that dynamically adjusts the graph to improve cluster separability, so that real clusters are identified more accurately than with other DR methods.

Dimension reduction (DR) algorithms have proven to be extremely useful for gaining insight into large-scale high-dimensional datasets, particularly for finding clusters in transcriptomic data. The initial phase of these DR methods often involves converting the original high-dimensional data into a graph, in which each edge represents the similarity or dissimilarity between a pair of data points. However, this graph is frequently suboptimal because high-dimensional distances are unreliable and only limited information is extracted from the high-dimensional data. The problem is exacerbated as the dataset size increases. If we reduce the size of the dataset by selecting points from specific sections of the embedding, the clusters observed through DR are more separable, since the extracted subgraphs are more reliable. In this paper, we introduce LocalMAP, a new dimensionality reduction algorithm that dynamically and locally adjusts the graph to address this challenge. By dynamically extracting subgraphs and updating the graph on the fly, LocalMAP is capable of identifying and separating real clusters within the data that other DR methods may overlook or combine. We demonstrate the benefits of LocalMAP through a case study on biological datasets, highlighting its utility in helping users more accurately identify clusters for real-world problems.
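The graph-construction phase described in the abstract can be illustrated with a minimal sketch. This is not LocalMAP and not the authors' code: it assumes a brute-force Euclidean k-nearest-neighbor graph, and the helper names (`knn_graph`, `cross_cluster_edge_fraction`, `two_blobs`) and the synthetic two-blob data are hypothetical, introduced only for illustration. It measures how many graph edges connect points from different clusters, which is one simple way to gauge how reliable such a graph is.

```python
import math
import random


def knn_graph(points, k):
    """Map each point index to the indices of its k nearest neighbors.

    Brute-force Euclidean search; an illustrative stand-in for the graph
    step of a DR method, not LocalMAP's actual procedure.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph


def cross_cluster_edge_fraction(graph, labels):
    """Fraction of kNN edges that link points with different labels."""
    edges = [(i, j) for i, nbrs in graph.items() for j in nbrs]
    crossing = sum(1 for i, j in edges if labels[i] != labels[j])
    return crossing / len(edges)


def two_blobs(n_per_blob, dim, separation, seed=0):
    """Two Gaussian blobs whose centers differ only in the first coordinate."""
    rng = random.Random(seed)
    points, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_blob):
            p = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            p[0] += label * separation
            points.append(p)
            labels.append(label)
    return points, labels


if __name__ == "__main__":
    # Same cluster separation, embedded in few vs. many noisy dimensions.
    for dim in (2, 200):
        pts, lbl = two_blobs(n_per_blob=100, dim=dim, separation=4.0)
        g = knn_graph(pts, k=10)
        print(dim, round(cross_cluster_edge_fraction(g, lbl), 3))
```

Running the script prints the cross-cluster edge fraction for a low-dimensional and a high-dimensional version of the same synthetic data. A higher fraction means more neighbor edges join points from different clusters, the kind of graph unreliability the abstract refers to.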
