Autoencoded UMAP-Enhanced Clustering for Unsupervised Learning
This work addresses unsupervised clustering for data analysis, but it appears incremental, as it integrates existing methods such as autoencoders and UMAP into a hybrid framework.
The authors tackle unsupervised learning by proposing Autoencoded UMAP-Enhanced Clustering (AUEC), a three-stage framework that combines autoencoder and UMAP embeddings with a clustering-promoting loss. On MNIST data, AUEC is reported to significantly outperform state-of-the-art techniques in terms of clustering accuracy.
We propose a novel approach to unsupervised learning that constructs a non-linear embedding of the data into a low-dimensional space and then applies any conventional clustering algorithm. The embedding promotes clusterability of the data and comprises two mappings: the encoder of an autoencoder neural network and the output of the UMAP algorithm. The autoencoder is trained with a composite loss function that incorporates both a conventional data reconstruction term, which acts as a regularization component, and a clustering-promoting component built using spectral graph theory. The two embeddings and the subsequent clustering are integrated into a three-stage unsupervised learning framework, referred to as Autoencoded UMAP-Enhanced Clustering (AUEC). When applied to MNIST data, AUEC significantly outperforms state-of-the-art techniques in terms of clustering accuracy.
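The abstract does not give the exact form of the composite loss, so the following is only a minimal sketch of how a reconstruction term and a spectral-graph term could be combined. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian affinity, the unnormalized graph Laplacian, the weight `lam`, and the function names `gaussian_affinity` and `composite_loss`. The spectral term uses the standard identity tr(Z^T L Z) = (1/2) sum_ij W_ij ||z_i - z_j||^2, which is small when points that are strongly connected in the graph receive nearby embeddings.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix over the rows of X (assumed graph choice)."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def composite_loss(X, X_hat, Z, W, lam=0.1):
    """Reconstruction error plus a graph-Laplacian penalty on the embedding Z.

    X     : (n, d) input data
    X_hat : (n, d) decoder output
    Z     : (n, k) encoder output
    W     : (n, n) symmetric affinity matrix
    lam   : hypothetical weight of the clustering-promoting term
    """
    n = X.shape[0]
    recon = np.mean((X - X_hat) ** 2)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    spectral = np.trace(Z.T @ L @ Z) / n
    return recon + lam * spectral, recon, spectral


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 5))
    Z = X[:, :2]  # stand-in for an encoder output
    W = gaussian_affinity(X)
    total, recon, spectral = composite_loss(X, X, Z, W, lam=0.5)
    print(total, recon, spectral)
```

In a full pipeline under these assumptions, the encoder output would be passed to UMAP and the resulting low-dimensional points to a conventional clustering algorithm; both stages are omitted here to keep the sketch free of external dependencies beyond NumPy.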