Modified Possibilistic Fuzzy C-Means Algorithm for Clustering Incomplete Data Sets
This work addresses a domain-specific problem for researchers and practitioners in data clustering who need to handle incomplete datasets, representing an incremental improvement over existing methods.
The authors tackled the problem of clustering incomplete data sets by modifying the Possibilistic Fuzzy C-Means (PFCM) algorithm into two variants, OCSPFCM and NPSPFCM, and found that NPSPFCM outperformed OCSPFCM in terms of accuracy percentage, number of iterations, and centroid errors.
The possibilistic fuzzy c-means (PFCM) algorithm is a reliable algorithm that was proposed to address the weaknesses of two popular clustering algorithms, fuzzy c-means (FCM) and possibilistic c-means (PCM). PFCM addresses the noise sensitivity of FCM and the coincident-cluster problem of PCM. However, the PFCM algorithm can only be applied to complete data sets. Therefore, in this study, we propose modifications of the PFCM algorithm that can be applied to clustering incomplete data sets. We modified the PFCM algorithm into two algorithms, OCSPFCM and NPSPFCM, and measured their performance on three criteria: 1) accuracy percentage, 2) number of iterations to termination, and 3) centroid errors. The results show that both algorithms have potential for clustering incomplete data sets. However, the NPSPFCM algorithm performs better than the OCSPFCM algorithm on these data sets.
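To make the idea of clustering incomplete data concrete, the sketch below shows one possible imputation step. It assumes that "NPS" stands for the nearest prototype strategy, which the abstract does not spell out: each missing feature is replaced by the corresponding feature of the closest centroid, where closeness is measured by a partial distance over the observed features only. The function names `partial_sq_distance` and `nps_impute`, the use of NaN to mark missing values, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import math


def partial_sq_distance(x, v):
    """Squared distance over observed (non-NaN) features of x,
    rescaled by (total features / observed features)."""
    observed = [(xi - vi) ** 2 for xi, vi in zip(x, v) if not math.isnan(xi)]
    if not observed:
        raise ValueError("datum has no observed features")
    return len(x) / len(observed) * sum(observed)


def nps_impute(data, centroids):
    """Fill each missing feature with that feature of the nearest centroid.
    Hypothetical helper: a sketch of a nearest-prototype imputation step."""
    completed = []
    for x in data:
        nearest = min(centroids, key=lambda v: partial_sq_distance(x, v))
        completed.append(
            [vi if math.isnan(xi) else xi for xi, vi in zip(x, nearest)]
        )
    return completed


# Toy example: NaN marks a missing value (an assumption of this sketch).
data = [[1.0, math.nan], [4.8, 5.1], [math.nan, 0.2]]
centroids = [[1.0, 0.0], [5.0, 5.0]]
completed = nps_impute(data, centroids)
print(completed)
```

In a full algorithm of this kind, a step like this would presumably alternate with the usual updates of memberships, typicalities, and centroids until termination; those updates are omitted here because the abstract does not give their form.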