Categorization Axioms for Clustering Results
This work addresses a foundational open problem in machine learning and data mining by providing a theoretical framework that could standardize and improve clustering methods, though it appears incremental as it builds on existing theories.
The paper tackles the challenge of establishing a unified axiomatic framework for data clustering by proposing categorization axioms that clustering results should satisfy. It shows that these axioms are consistent with cognitive science theories of categorization and lead to principles for designing clustering algorithms and validity indices.
Cluster analysis has attracted increasing attention in machine learning and data mining. Numerous clustering algorithms have been proposed, and more are being developed, driven by diverse theories and the varied requirements of emerging applications. It is therefore worthwhile to establish a unified axiomatic framework for data clustering. In the literature, this remains an open problem and has proved very challenging. In this paper, clustering results are axiomatized by assuming that a proper clustering result should satisfy categorization axioms. The proposed axioms not only introduce a classification of clustering results and inequalities of clustering results, but are also consistent with the prototype theory and exemplar theory of categorization in cognitive science. Moreover, the proposed axioms lead to three principles for designing clustering algorithms and cluster validity indices, principles that many popular clustering algorithms and cluster validity indices follow.