NE, AI, MA, ML | Jun 10, 2021

Swarm Intelligence for Self-Organized Clustering

arXiv:2106.05521v1 | 76 citations
Originality: Incremental advance
AI Analysis

The paper addresses clustering without prior knowledge, offering data scientists a parameter-free alternative that can also indicate when clustering a data set is meaningless. The contribution appears incremental, since it combines existing swarm intelligence concepts.

The paper introduces Databionic swarm (DBS), a swarm intelligence system that clusters high-dimensional data without a global objective function or parameters. According to the authors, it can outperform common methods such as K-means and spectral clustering when no prior knowledge about the data is available, as shown on benchmark data and in real-world applications.

Algorithms implementing populations of agents which interact with one another and sense their environment may exhibit emergent behavior such as self-organization and swarm intelligence. Here a swarm system, called Databionic swarm (DBS), is introduced which is able to adapt itself to structures of high-dimensional data characterized by distance and/or density-based structures in the data space. By exploiting the interrelations of swarm intelligence, self-organization and emergence, DBS serves as an alternative approach to the optimization of a global objective function in the task of clustering. The swarm does not use a global objective function and is parameter-free because it searches for the Nash equilibrium during its annealing process. To our knowledge, DBS is the first swarm combining these approaches. Its clustering can outperform common clustering methods such as K-means, PAM, single linkage, spectral clustering, model-based clustering, and Ward, if no prior knowledge about the data is available. A central problem in clustering is the correct estimation of the number of clusters. This is addressed by a DBS visualization called the topographic map, which allows the number of clusters to be assessed. It is known that all clustering algorithms construct clusters, irrespective of whether the data set contains clusters or not. In contrast to most other clustering algorithms, the topographic map identifies that clustering of the data is meaningless if the data contains no (natural) clusters. The performance of DBS is demonstrated on a set of benchmark data constructed to pose difficult clustering problems, and in two real-world applications.
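
The abstract's remark that clustering algorithms construct clusters whether or not the data contains any can be illustrated with a small sketch. The code below is not DBS and is not taken from the paper. It is a plain Lloyd-style K-means written for illustration, applied to points drawn uniformly from the unit square. The function name `kmeans`, the point count, the seeds, and the choice of k = 3 are all assumptions made for this example.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means on 2-D points (illustrative, not DBS).

    Returns one cluster label per point.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct data points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2
                + (p[1] - centers[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels

# Structureless data: 300 points drawn uniformly from the unit square
# (assumed sizes and seeds, chosen only for this illustration).
data_rng = random.Random(42)
uniform_points = [(data_rng.random(), data_rng.random()) for _ in range(300)]

labels = kmeans(uniform_points, k=3)
cluster_sizes = [labels.count(c) for c in range(3)]
print(cluster_sizes)  # every point receives a label despite the lack of structure
```

K-means assigns every point to one of the three groups even though the data has no cluster structure, and nothing in its output signals that the partition is arbitrary. This is the gap that, per the abstract, the DBS topographic map is meant to fill.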

