LGNA Jul 25, 2023

DBGSA: A Novel Data Adaptive Bregman Clustering Algorithm

arXiv:2307.14375v1 | 11 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses clustering challenges in data analysis, particularly for non-convex datasets, though it appears incremental as it builds on existing methods like Bregman divergence and gravitational algorithms.

The paper tackles the sensitivity of traditional clustering algorithms to initial centroids and poor performance on non-convex datasets by proposing DBGSA, a data-driven Bregman clustering algorithm that improves accuracy by an average of 63.8% compared to similar approaches.

With the development of big data technology, data analysis has become increasingly important. Traditional clustering algorithms such as K-means are highly sensitive to the initial centroid selection and perform poorly on non-convex datasets. In this paper, we address these problems by proposing a data-driven Bregman divergence parameter optimization clustering algorithm (DBGSA), which incorporates the Universal Gravitational Algorithm to bring similar points in the dataset closer together. We construct a gravitational coefficient equation with a special property: it gradually reduces the influence factor as the iterations progress. Furthermore, we identify cluster centers by minimizing the information loss of a Bregman divergence under a generalized power mean, and we build a hyperparameter identification optimization model, which effectively solves the problems of manual adjustment and uncertainty in the improved dataset. Extensive experiments are conducted on four simulated datasets and six real datasets. The results demonstrate that DBGSA improves the accuracy of various clustering algorithms by an average of 63.8% compared to other similar approaches such as enhanced clustering algorithms and improved datasets. Additionally, a three-dimensional grid search was set up to compare the effects of different parameter values within threshold conditions, and it found that the parameter set provided by our model is optimal. This finding provides strong evidence of the high accuracy and robustness of the algorithm.
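
As a rough illustration of two ingredients named in the abstract, the Python sketch below shows a generic Bregman divergence, D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, and a gravitational preprocessing step whose coefficient shrinks as the iteration count grows. The abstract gives no formulas, so the details here are assumptions made for illustration only: the exponential decay schedule, the nearest-neighbour pull, and the names g0, decay, gravity_step, and bregman_kmeans are all hypothetical. The center update uses the arithmetic mean of standard Bregman hard clustering, not the paper's generalized power mean, and no hyperparameter optimization is attempted.

```python
import math


def bregman_divergence(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


# Generator phi(v) = ||v||^2 yields the squared Euclidean distance.
def sq_norm(v):
    return sum(t * t for t in v)


def sq_norm_grad(v):
    return [2.0 * t for t in v]


# Generator phi(v) = sum v_i log v_i (v_i > 0) yields the generalized KL divergence.
def neg_entropy(v):
    return sum(t * math.log(t) for t in v)


def neg_entropy_grad(v):
    return [math.log(t) + 1.0 for t in v]


def gravity_step(points, g0, decay, iteration):
    """Pull each point toward its nearest neighbour (assumed rule, not the paper's).

    The pull strength g0 * exp(-decay * iteration) is a hypothetical schedule
    chosen only because it decreases as the iteration count grows.
    """
    coeff = g0 * math.exp(-decay * iteration)
    moved = []
    for i, p in enumerate(points):
        others = (q for j, q in enumerate(points) if j != i)
        nearest = min(others, key=lambda q: math.dist(p, q))
        moved.append([pi + coeff * (qi - pi) for pi, qi in zip(p, nearest)])
    return moved


def bregman_kmeans(points, centers, phi, grad_phi, n_iter=10):
    """Standard Bregman hard clustering: assign by divergence, update by mean."""
    clusters = [[] for _ in centers]
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            k = min(
                range(len(centers)),
                key=lambda c: bregman_divergence(phi, grad_phi, p, centers[c]),
            )
            clusters[k].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers, clusters
```

One way to combine the pieces is to call gravity_step for a few iterations to tighten the dataset and then pass the moved points to bregman_kmeans with a chosen generator such as sq_norm. Swapping in neg_entropy changes the notion of distance without changing the clustering loop, which is the main appeal of the Bregman formulation.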
