A New Homogeneity Inter-Clusters Measure in Semi-Supervised Clustering
This work addresses the challenge of improving clustering effectiveness in data mining applications where labeled data are scarce; its contribution is an incremental improvement.
The paper aims to improve the accuracy of semi-supervised clustering by introducing a new homogeneity measure for similarity computation; the reported experiments show significantly improved accuracy.
Many studies in data mining have proposed a learning paradigm called semi-supervised learning. This type of learning combines unlabeled data with labeled data, which are hard to obtain. Unsupervised methods, by contrast, use only unlabeled data. The significance and effectiveness of semi-supervised clustering results are becoming a central concern. This paper pursues the thesis that much greater accuracy can be achieved in such clustering by improving the similarity computation. Hence, we introduce a new approach to semi-supervised clustering that uses a new homogeneity measure of the generated clusters. Our experimental results demonstrate significantly improved accuracy.