A Modular Spatial Clustering Algorithm with Noise Specification
This paper addresses the problem of parameter tuning in clustering for data mining and machine learning applications, though the contribution appears incremental because it builds on existing algorithms such as DBSCAN.
The paper tackles the problem of clustering algorithms requiring hard-to-estimate input parameters by proposing Bacteria-Farm, a novel algorithm that balances performance and ease of finding optimal parameters, with a modular design and noise specification feature.
Clustering techniques have been key drivers of data mining, machine learning and pattern recognition for decades. DBSCAN is one of the most popular clustering algorithms because of its high accuracy and noise tolerance. However, many high-performing algorithms, DBSCAN among them, have input parameters that are hard to estimate, so finding suitable values is a time-consuming process. In this paper, we propose a novel clustering algorithm, Bacteria-Farm, which balances clustering performance against the ease of finding the optimal parameters. The Bacteria-Farm algorithm is inspired by the growth of bacteria in closed experimental farms, specifically their ability to consume food and grow, which closely resembles the ideal cluster growth desired in clustering algorithms. In addition, the algorithm has a modular design that allows versions of it to be created for specific tasks or data distributions. In contrast with other clustering algorithms, our algorithm also has a provision to specify the amount of noise to be excluded during clustering.
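To illustrate why DBSCAN's input parameters are considered hard to estimate, the following is a minimal sketch of textbook DBSCAN in pure Python. It is not the Bacteria-Farm algorithm. The function names (`region_query`, `dbscan`, `summarize`) and the toy points are assumptions made for this example and do not come from the paper.

```python
from math import dist

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal textbook DBSCAN; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core point; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep growing
    return labels


def summarize(labels):
    """Return (number of clusters, number of noise points)."""
    return len(set(labels) - {NOISE}), labels.count(NOISE)


# Illustrative toy data: two tight groups of four points and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 5), (5, 6), (6, 5), (6, 6),
          (10, 0)]

for eps in (0.5, 1.5, 8.0):
    n_clusters, n_noise = summarize(dbscan(points, eps, min_pts=3))
    print(f"eps={eps}: {n_clusters} cluster(s), {n_noise} noise point(s)")
```

On this toy data, changing only `eps` moves the result from every point being labelled noise, to two clusters plus one outlier, to a single cluster containing all points. The amount of noise is a side effect of the parameter choice here rather than something the user specifies directly.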