Snapshot Spectral Clustering -- a costless approach to deep clustering ensembles generation
This work addresses the problem of improving clustering accuracy for large, complex datasets in unsupervised machine learning, though it appears incremental in combining existing techniques.
The paper tackles the challenge of unsupervised learning from large datasets by proposing Snapshot Spectral Clustering, a deep clustering ensemble method that combines multiple data views from deep neural networks to enhance clustering results while minimizing computational costs. Experiments demonstrate the effectiveness of the proposed concept, with a hyperparameter study providing guidance for parameter selection.
Despite tremendous advancements in Artificial Intelligence, learning from large datasets in an unsupervised manner remains a significant challenge. Classical clustering algorithms often fail to discover complex dependencies in large datasets, especially in sparse, high-dimensional spaces. Deep learning techniques, however, have proved successful at handling large quantities of data, efficiently reducing their dimensionality while preserving the underlying information. Several interesting advances have already been made in combining deep learning and clustering. Still, the idea of enhancing clustering results by combining multiple views of the data generated by deep neural networks remains insufficiently explored. This paper investigates this direction and aims to bridge the gap between deep neural networks, clustering techniques and ensemble learning methods. To achieve this goal, we propose a novel deep clustering ensemble method, Snapshot Spectral Clustering, designed to maximize the gain from combining multiple data views while minimizing the computational cost of creating the ensemble. The comparative analysis and experiments described in this paper validate the proposed concept, while the hyperparameter study provides practical intuition for selecting hyperparameter values.
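To make the idea of combining multiple data views concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify how the views are produced or fused, so the sketch assumes that each view is an embedding matrix of the same samples (for example, taken at different training checkpoints of one network), that views are fused by averaging Gaussian affinity matrices, and that the final partition comes from standard normalized spectral clustering. All function names and the `sigma` parameter are hypothetical.

```python
import numpy as np

def rbf_affinity(view, sigma=1.0):
    """Gaussian (RBF) affinity matrix for one embedding view of shape (n, d)."""
    sq = np.sum(view ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero against rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * view @ view.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def combined_affinity(views, sigma=1.0):
    """Average the per-view affinities (an assumed fusion rule)."""
    return np.mean([rbf_affinity(v, sigma) for v in views], axis=0)

def spectral_embedding(affinity, k):
    """Row-normalized bottom-k eigenvectors of the normalized Laplacian."""
    deg = affinity.sum(axis=1)  # strictly positive: the diagonal is 1
    inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(deg)) - inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    emb = vecs[:, :k]
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def snapshot_spectral_sketch(views, k, sigma=1.0):
    """Fuse the views into one affinity matrix, then cluster it spectrally."""
    return kmeans(spectral_embedding(combined_affinity(views, sigma), k), k)
```

Calling `snapshot_spectral_sketch(views, k=2)` with a list of `(n, d)` arrays returns one integer label per sample. Averaging affinities is only one of several possible fusion rules; it is used here because it leaves a single spectral decomposition as the main cost, regardless of the number of views.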