Accelerating UMAP for Large-Scale Datasets Through Spectral Coarsening
This work addresses the scalability of UMAP for researchers and practitioners handling large-scale data. It is an incremental improvement built on a novel compression technique.
The paper tackles the problem of UMAP's computational inefficiency for large datasets by introducing a spectral coarsening method that compresses data while preserving manifold structure, achieving faster performance with maintained embedding quality, as demonstrated on the USPS dataset.
This paper introduces an approach to substantially accelerate UMAP using spectral data compression. The proposed method reduces the size of the dataset through spectral coarsening while preserving its essential manifold structure. This allows UMAP to run much faster while maintaining the quality of its embeddings. Experiments on real-world datasets, such as USPS, demonstrate that the method achieves substantial data reduction without compromising embedding fidelity.
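To make the idea of compressing data while keeping manifold structure concrete, the following is a minimal sketch and not the paper's algorithm, whose details are not given here. It swaps in a simple spectral-clustering-style aggregation: build a k-nearest-neighbour graph, embed the nodes with the low eigenvectors of the graph Laplacian, group nodes that are close in that spectral space, and replace each group with its centroid. All function names and parameters (`knn_graph`, `spectral_coordinates`, `kmeans_labels`, `coarsen`, `k`, `dim`, `n_clusters`) are hypothetical, and random data stands in for a real dataset such as USPS.

```python
import numpy as np


def knn_graph(X, k):
    """Dense, symmetric, unweighted k-nearest-neighbour adjacency matrix."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-matches
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)                         # symmetrise


def spectral_coordinates(A, dim):
    """Low eigenvectors of the combinatorial Laplacian L = D - A.

    Assumes a connected graph, so only the first eigenvector is constant.
    """
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)                       # ascending eigenvalues
    return vecs[:, 1:dim + 1]                         # skip the constant one


def kmeans_labels(Z, n_clusters, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns a cluster label per row of Z."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), n_clusters, replace=False)]
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(n_clusters):
            members = Z[labels == c]
            if len(members):                          # keep old center if empty
                centers[c] = members.mean(axis=0)
    return labels


def coarsen(X, n_clusters, k=10, dim=5):
    """Replace spectrally similar points with their centroid."""
    Z = spectral_coordinates(knn_graph(X, k), dim)
    raw = kmeans_labels(Z, n_clusters)
    used, labels = np.unique(raw, return_inverse=True)  # drop empty clusters
    X_small = np.stack([X[labels == c].mean(axis=0) for c in range(len(used))])
    return X_small, labels


# Synthetic stand-in data (an assumption; not the USPS dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 16))
X_small, labels = coarsen(X, n_clusters=30)
print(X.shape, "->", X_small.shape)
```

With the umap-learn package installed, the compressed points could then be embedded with `umap.UMAP().fit_transform(X_small)`, and `labels` records which compressed point each original sample was merged into. The dense eigendecomposition here scales cubically with the number of points, so this sketch only illustrates the idea and would not itself deliver a speed-up on large datasets.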