LG · CV · Mar 5, 2018

Deep Continuous Clustering

arXiv:1803.01449v1 · 80 citations
Originality: Highly original
AI Analysis

This paper addresses the challenge of clustering in high-dimensional spaces for data analysis applications. It represents a novel approach rather than an incremental improvement.

The paper tackles the problem of clustering high-dimensional datasets by jointly performing nonlinear dimensionality reduction and clustering through a deep autoencoder optimized as part of the clustering process, resulting in an algorithm that outperforms state-of-the-art methods across multiple domains.

Clustering high-dimensional datasets is hard because interpoint distances become less informative in high-dimensional spaces. We present a clustering algorithm that performs nonlinear dimensionality reduction and clustering jointly. The data is embedded into a lower-dimensional space by a deep autoencoder. The autoencoder is optimized as part of the clustering process. The resulting network produces clustered data. The presented approach does not rely on prior knowledge of the number of ground-truth clusters. Joint nonlinear dimensionality reduction and clustering are formulated as optimization of a global continuous objective. We thus avoid discrete reconfigurations of the objective that characterize prior clustering algorithms. Experiments on datasets from multiple domains demonstrate that the presented algorithm outperforms state-of-the-art clustering schemes, including recent methods that use deep networks.
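To make the idea of a single continuous objective more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's exact formulation. The abstract does not specify the robust penalty (a Geman-McClure form is assumed here), the weights `lam` and `mu`, the neighbor `edges`, or the merge `threshold`. The autoencoder is also left out: its embeddings and reconstructions are passed in as plain lists. All function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def robust_penalty(d2, mu):
    """Assumed robust penalty (Geman-McClure form) on a squared distance."""
    return mu * d2 / (mu + d2)


def joint_objective(points, recon, embed, reps, edges, lam=1.0, mu=1.0):
    """Illustrative continuous objective with three terms:
    - reconstruction error of the (assumed) autoencoder,
    - a data term tying each representative to its embedding,
    - a pairwise term pulling representatives of neighboring points together.
    Every term is a smooth function of its inputs; no cluster assignments
    or cluster count appear anywhere.
    """
    recon_term = sum(sq_dist(x, r) for x, r in zip(points, recon))
    data_term = sum(robust_penalty(sq_dist(z, y), mu) for z, y in zip(reps, embed))
    pair_term = sum(robust_penalty(sq_dist(reps[i], reps[j]), mu) for i, j in edges)
    return recon_term + data_term + lam * pair_term


def count_clusters(reps, edges, threshold):
    """Read clusters off after optimization: connected components of the
    edges whose two representatives lie within `threshold` of each other."""
    parent = list(range(len(reps)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in edges:
        if sq_dist(reps[i], reps[j]) < threshold ** 2:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(reps))})


# Toy usage: four representatives that have collapsed into two groups.
reps = [(0.0, 0.0), (0.01, 0.0), (5.0, 5.0), (5.0, 5.01)]
edges = [(0, 1), (1, 2), (2, 3)]
n_clusters = count_clusters(reps, edges, threshold=0.1)
```

In this sketch the number of clusters is an output of `count_clusters`, not an input to the objective. That is one way to read the abstract's claim that the approach does not rely on prior knowledge of the number of ground-truth clusters.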

Code Implementations: 3 repos
