LG · Apr 13, 2013

A New Homogeneity Inter-Clusters Measure in SemiSupervised Clustering

arXiv:1304.3840v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the challenge of making clustering more effective in data-mining applications where labeled data are scarce. It represents an incremental improvement.

The paper tackles the problem of improving accuracy in semi-supervised clustering by introducing a new homogeneity measure for similarity computation, resulting in significantly improved accuracy as demonstrated in experiments.

Many studies in data mining have proposed a learning paradigm called semi-supervised learning. This type of learning combines unlabeled data with labeled data, which are hard to obtain. In unsupervised methods, by contrast, only unlabeled data are used. The significance and effectiveness of semi-supervised clustering results are becoming of major importance. This paper pursues the thesis that much greater accuracy can be achieved in such clustering by improving the similarity computation. Hence, we introduce a new approach to semi-supervised clustering that uses a new homogeneity measure of the generated clusters. Our experimental results demonstrate significantly improved accuracy as a result.
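The abstract does not define the proposed homogeneity measure, so the sketch below is not the paper's method. It only illustrates the general idea of scoring how homogeneous each generated cluster is with respect to the few labels available, using a plain majority-label purity. The function name `cluster_homogeneity` and the toy data are assumptions made for illustration.

```python
from collections import Counter


def cluster_homogeneity(cluster_ids, labels):
    """Per-cluster purity computed over the labeled points only.

    cluster_ids: cluster assignment for every point.
    labels: class label for each point, or None when the point is unlabeled.
    Returns {cluster_id: purity}, where purity is the share of labeled
    points in the cluster that carry the cluster's most common label.
    Clusters with no labeled points are omitted.

    NOTE: a generic purity score, assumed here for illustration; it is
    not the measure introduced in the paper.
    """
    by_cluster = {}
    for cid, lab in zip(cluster_ids, labels):
        if lab is not None:
            by_cluster.setdefault(cid, Counter())[lab] += 1
    return {
        cid: max(counts.values()) / sum(counts.values())
        for cid, counts in by_cluster.items()
    }


# Toy example: seven points, two clusters, two points left unlabeled.
clusters = [0, 0, 0, 1, 1, 1, 1]
labels = ["a", "a", None, "a", "b", "b", None]
scores = cluster_homogeneity(clusters, labels)
```

A score near 1 means the labeled points in a cluster agree with one another, while a lower score flags a cluster that mixes classes and may need to be split or reassigned.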

