LG AI CV ML

May 19, 2020

A New Validity Index for Fuzzy-Possibilistic C-Means Clustering

Mohammad Hossein Fazel Zarandi, Shahabeddin Sotudian, Oscar Castillo

arXiv:2005.09162v1

3.38 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of determining optimal clustering parameters in noisy data for researchers in data mining and pattern recognition, representing an incremental improvement over existing fuzzy validity indices.

The paper tackles the problem of conflicting cluster validity indices in noisy datasets by introducing a new Fuzzy-Possibilistic (FP) index for fuzzy-possibilistic c-means clustering, which shows improved performance on synthetic and real-world datasets compared to existing indices.

In some complicated datasets, noisy data points and outliers can cause cluster validity indices to give conflicting results when determining the optimal number of clusters. This paper presents a new validity index for fuzzy-possibilistic c-means (FPCM) clustering, called the Fuzzy-Possibilistic (FP) index, which works well in the presence of clusters that vary in shape and density. Moreover, FPCM, like most clustering algorithms, is sensitive to its initial parameters: in addition to the number of clusters, it requires a priori selection of the degree of fuzziness and the degree of typicality. We therefore present an efficient procedure for determining their optimal values. The proposed approach has been evaluated on several synthetic and real-world datasets. The computational results demonstrate its capability and reliability compared with several well-known fuzzy validity indices from the literature. Furthermore, to demonstrate its usefulness in real applications, the proposed method is applied to microarray gene expression data clustering and medical image segmentation.
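To make the three parameters named in the abstract concrete, the following is a minimal sketch of the commonly described FPCM update rules, not the paper's FP index or its parameter-selection procedure, neither of which is specified here. The function name `fpcm` and the argument names `c` (number of clusters), `m` (degree of fuzziness), and `eta` (degree of typicality) are illustrative assumptions, as are the random initialisation and the stopping rule.

```python
import numpy as np

def fpcm(X, c, m=2.0, eta=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative FPCM sketch (assumed names; not the paper's code).

    X   : (n, d) data matrix
    c   : number of clusters
    m   : degree of fuzziness, m > 1
    eta : degree of typicality, eta > 1
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: initialise centres with c distinct random data points.
    V = X[rng.choice(n, size=c, replace=False)].copy()

    for _ in range(n_iter):
        # Squared Euclidean distances between centres and points, shape (c, n).
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero

        # Memberships: for each point, normalised over the c clusters.
        inv_u = d2 ** (-1.0 / (m - 1.0))
        U = inv_u / inv_u.sum(axis=0, keepdims=True)

        # Typicalities: for each cluster, normalised over the n points.
        inv_t = d2 ** (-1.0 / (eta - 1.0))
        T = inv_t / inv_t.sum(axis=1, keepdims=True)

        # Centres: weighted mean with weights u^m + t^eta.
        W = U ** m + T ** eta
        V_new = (W @ X) / W.sum(axis=1, keepdims=True)

        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break

    # U and T are those computed from the centres before the final update.
    return V, U, T
```

A validity index such as the one the paper proposes would typically be evaluated on the returned `V`, `U`, and `T` for each candidate value of `c`, `m`, and `eta`; that scoring step is deliberately left out because the index formula is not given in this listing.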
