LSEC: Large-scale spectral ensemble clustering
This work addresses the problem of scaling ensemble clustering to large datasets, which matters to both researchers and practitioners. It appears incremental, since it builds on existing spectral clustering and bipartite graph partitioning techniques.
The paper tackles the efficiency bottleneck in large-scale ensemble clustering by proposing LSEC, a method that balances efficiency and effectiveness. LSEC has lower computational complexity than most existing methods, and experiments on ten large-scale datasets support its efficiency and effectiveness.
Ensemble clustering is a fundamental problem in machine learning: it combines multiple base clusterings into a better clustering result. However, most existing methods are unsuitable for large-scale ensemble clustering tasks because of an efficiency bottleneck. In this paper, we propose a large-scale spectral ensemble clustering (LSEC) method that strikes a good balance between efficiency and effectiveness. In LSEC, an efficient ensemble generation framework based on large-scale spectral clustering is designed to generate diverse base clusterings with low computational complexity. All base clusterings are then combined, through a consensus function based on bipartite graph partitioning, into a better consensus clustering result. The LSEC method achieves lower computational complexity than most existing ensemble clustering methods. Experiments conducted on ten large-scale datasets demonstrate the efficiency and effectiveness of the LSEC method. The MATLAB code of the proposed method and the experimental datasets are available at https://github.com/Li-Hongmin/MyPaperWithCode.
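The two-stage pipeline described in the abstract (generate base clusterings, then merge them with a bipartite graph partition) can be illustrated with a small NumPy sketch. This is not the authors' MATLAB implementation. As assumptions for illustration, plain k-means stands in for the large-scale spectral clustering generator, the consensus step is a generic spectral partition of the object-cluster bipartite graph computed with a dense SVD, and all function names (`farthest_first`, `kmeans`, `bipartite_consensus`) and the toy data are hypothetical.

```python
import numpy as np

def farthest_first(X, k, rng):
    """Pick k initial centers: one random point, then repeatedly the point
    farthest from the centers chosen so far."""
    idx = [int(rng.integers(len(X)))]
    dist = ((X - X[idx[0]]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        idx.append(int(dist.argmax()))
        dist = np.minimum(dist, ((X - X[idx[-1]]) ** 2).sum(axis=1))
    return X[idx].copy()

def kmeans(X, k, rng, n_iter=30):
    """Plain Lloyd's k-means (a stand-in for the paper's spectral generator)."""
    centers = farthest_first(X, k, rng)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave a center untouched if it owns no points
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def bipartite_consensus(base_labelings, k, rng):
    """Merge base clusterings by partitioning the object-cluster bipartite graph."""
    n = len(base_labelings[0])
    blocks = []
    for labels in base_labelings:
        # Re-index labels so that only non-empty clusters get a column.
        _, inv = np.unique(labels, return_inverse=True)
        onehot = np.zeros((n, inv.max() + 1))
        onehot[np.arange(n), inv] = 1.0
        blocks.append(onehot)
    B = np.hstack(blocks)   # n objects x c base clusters (edge weights)
    d_obj = B.sum(axis=1)   # object degrees (= number of base clusterings)
    d_clu = B.sum(axis=0)   # cluster degrees (= cluster sizes)
    B_norm = B / np.sqrt(d_obj)[:, None] / np.sqrt(d_clu)[None, :]
    # The top-k left singular vectors give a spectral embedding of the objects.
    U, _, _ = np.linalg.svd(B_norm, full_matrices=False)
    emb = U[:, :k]
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return kmeans(emb, k, rng)

# Toy data: three well-separated Gaussian blobs of 50 points each.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
truth = np.repeat(np.arange(3), 50)
X = blob_centers[truth] + 0.5 * rng.standard_normal((150, 2))

# Stage 1: diverse base clusterings (varying k). Stage 2: consensus.
base = [kmeans(X, k, rng) for k in (3, 4, 5, 6, 3, 4, 5, 6)]
consensus = bipartite_consensus(base, 3, rng)
```

The sketch uses dense matrices and a full SVD for brevity, so it does not reflect the low computational complexity claimed for LSEC. The point it illustrates is that the bipartite graph has only as many cluster-side nodes as there are base clusters, which is typically far fewer than the number of objects.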