LG · Jan 13, 2025

Autoencoded UMAP-Enhanced Clustering for Unsupervised Learning

arXiv:2501.07729v2 · 1 citation · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses unsupervised clustering for data analysis, but it appears incremental: it integrates existing methods, such as autoencoders and UMAP, into a hybrid framework.

The authors address unsupervised learning by proposing Autoencoded UMAP-Enhanced Clustering (AUEC), a three-stage framework that combines autoencoder and UMAP embeddings, with the autoencoder trained under a clustering-promoting loss. On MNIST data, the authors report that AUEC significantly outperforms state-of-the-art techniques in clustering accuracy.
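Clustering accuracy is not defined on this page. A common convention for unsupervised results, assumed here rather than confirmed for AUEC, scores the best one-to-one matching between cluster ids and ground-truth classes, found with the Hungarian algorithm. The sketch below illustrates that convention; the function name `clustering_accuracy` and the toy labels are illustrative only.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def clustering_accuracy(y_true, y_pred):
    """Best-match accuracy between cluster ids and ground-truth labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Map arbitrary label values to contiguous indices 0..k-1.
    true_ids, true_idx = np.unique(y_true, return_inverse=True)
    pred_ids, pred_idx = np.unique(y_pred, return_inverse=True)
    # Contingency table: counts[i, j] = points in cluster i with true class j.
    counts = np.zeros((pred_ids.size, true_ids.size), dtype=np.int64)
    np.add.at(counts, (pred_idx, true_idx), 1)
    # Hungarian algorithm: one-to-one cluster-to-class matching
    # that maximizes the number of agreeing points.
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / y_true.size


# Toy labels (assumed data): same grouping with permuted ids and one mistake.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [2, 2, 2, 0, 0, 1, 1, 0]
acc = clustering_accuracy(y_true, y_pred)
print(acc)  # 0.875
```

Because cluster ids carry no meaning, the matching step makes the score invariant to how the clustering algorithm happens to number its clusters.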

We propose a novel approach to unsupervised learning by constructing a non-linear embedding of the data into a low-dimensional space, followed by any conventional clustering algorithm. The embedding promotes clusterability of the data and comprises two mappings: the encoder of an autoencoder neural network and the output of the UMAP algorithm. The autoencoder is trained with a composite loss function that incorporates both a conventional data reconstruction term as a regularization component and a clustering-promoting component built using spectral graph theory. The two embeddings and the subsequent clustering are integrated into a three-stage unsupervised learning framework, referred to as Autoencoded UMAP-Enhanced Clustering (AUEC). When applied to MNIST data, AUEC significantly outperforms state-of-the-art techniques in terms of clustering accuracy.
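The abstract does not spell out the clustering-promoting component, so the following is only a minimal sketch of how a composite loss of this shape could look. It swaps in a standard spectral-graph-theory penalty, the Laplacian smoothness term tr(ZᵀLZ), which is an assumption and not necessarily the term used in AUEC. The Gaussian-kernel graph, the weight `lam`, the bandwidth `sigma`, and all function names are hypothetical.

```python
import numpy as np


def graph_laplacian(x, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a Gaussian-kernel similarity graph."""
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops
    return np.diag(w.sum(axis=1)) - w


def composite_loss(x, z, x_hat, lam=0.1, sigma=1.0):
    """Assumed form: spectral clustering term + lam * reconstruction MSE."""
    n = x.shape[0]
    lap = graph_laplacian(x, sigma)
    # tr(Z^T L Z) = 1/2 * sum_ij W_ij * ||z_i - z_j||^2, which is small
    # when similar inputs are embedded close together.
    spectral = np.trace(z.T @ lap @ z) / n
    # Reconstruction error acts as the regularization component.
    reconstruction = np.mean((x - x_hat) ** 2)
    return spectral + lam * reconstruction


# Synthetic data (assumed): two well-separated blobs in 5 dimensions.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 0.1, (20, 5)), rng.normal(3.0, 0.1, (20, 5))])
x_hat = x.copy()  # pretend the decoder reconstructs perfectly

# An embedding that respects the blobs versus one that scrambles them.
z_good = np.repeat(np.array([[0.0, 0.0], [1.0, 1.0]]), 20, axis=0)
z_bad = z_good[rng.permutation(40)]

loss_good = composite_loss(x, z_good, x_hat)
loss_bad = composite_loss(x, z_bad, x_hat)
print(loss_good < loss_bad)
```

In an actual training loop the same expression would be written in an autodiff framework so that gradients flow through the encoder and decoder. The NumPy version here only shows how the two components combine and why the spectral term favors clusterable embeddings.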

