Partitioning Relational Matrices of Similarities or Dissimilarities using the Value of Information
This method is aimed at researchers who cluster similarity or dissimilarity data. It offers an incremental improvement by selecting the number of clusters automatically.
The paper tackles the problem of clustering relational matrices without specifying the number of clusters in advance. It optimizes a value-of-information criterion, which leads to a deterministic-annealing procedure in which the partition undergoes data-driven phase changes and the global-best partition can often be identified.
In this paper, we provide an approach to clustering relational matrices whose entries correspond to either similarities or dissimilarities between objects. Our approach is based on the value of information, a parameterized, information-theoretic criterion that measures the change in costs associated with changes in information. Optimizing the value of information yields a deterministic-annealing style of clustering with several benefits. For instance, investigators do not need to specify the number of clusters a priori: during the annealing process the partitions naturally undergo phase changes, whereby the number of clusters changes in a data-driven fashion. The global-best partition can also often be identified.
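
To make the annealing idea concrete, the following is a minimal sketch of generic deterministic-annealing clustering driven only by a dissimilarity matrix. It is not the paper's algorithm: the Gibbs-style update, the relational cost formula (which equals the squared distance to a weighted centroid only when the dissimilarities are squared Euclidean distances), the geometric schedule, the jitter used to break symmetry, and every name (`relational_costs`, `gibbs_update`, `anneal`, `hard_labels`, `beta`, `k_max`) are assumptions chosen for illustration.

```python
import math
import random


def relational_costs(D, weights):
    """Cost of assigning each object to each cluster, from dissimilarities only.

    weights[c] is a probability distribution over objects describing cluster c.
    Assumption: when D holds squared Euclidean distances, the value for (c, i)
    equals the squared distance from object i to the weighted centroid of c.
    """
    n = len(D)
    costs = []
    for w in weights:
        avg = [sum(D[i][j] * w[j] for j in range(n)) for i in range(n)]
        spread = 0.5 * sum(w[i] * avg[i] for i in range(n))
        costs.append([avg[i] - spread for i in range(n)])
    return costs


def gibbs_update(D, P, beta):
    """One fixed-point update of the soft assignments P (n rows, k columns)."""
    n, k = len(P), len(P[0])
    # Cluster probabilities, floored so an emptied cluster cannot divide by zero.
    prior = [max(sum(row[c] for row in P) / n, 1e-12) for c in range(k)]
    weights = [[P[i][c] / (n * prior[c]) for i in range(n)] for c in range(k)]
    costs = relational_costs(D, weights)
    new_P = []
    for i in range(n):
        logits = [math.log(prior[c]) - beta * costs[c][i] for c in range(k)]
        top = max(logits)  # subtract the maximum for numerical stability
        expd = [math.exp(v - top) for v in logits]
        total = sum(expd)
        new_P.append([v / total for v in expd])
    return new_P


def anneal(D, k_max, betas, inner_steps=50, seed=0):
    """Sweep beta (inverse temperature) upward, iterating the update at each value."""
    rng = random.Random(seed)
    n = len(D)
    P = [[1.0 / k_max] * k_max for _ in range(n)]
    for beta in betas:
        # Small multiplicative jitter so coincident clusters can separate
        # once beta passes a critical value (a phase change).
        P = [[p * (1.0 + 0.01 * (rng.random() - 0.5)) for p in row] for row in P]
        P = [[p / sum(row) for p in row] for row in P]
        for _ in range(inner_steps):
            P = gibbs_update(D, P, beta)
    return P


def hard_labels(P):
    """Index of the most probable cluster for each object."""
    return [max(range(len(row)), key=row.__getitem__) for row in P]


# Toy data (hypothetical): two well-separated groups on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
D = [[(a - b) ** 2 for b in points] for a in points]
betas = [0.01 * 1.5 ** t for t in range(20)]  # geometric schedule, an assumption
labels = hard_labels(anneal(D, k_max=2, betas=betas))
print(labels)
```

At small `beta` all clusters coincide and every object receives the same soft assignment; as `beta` grows past a data-dependent threshold the clusters separate. In this sketch `k_max` is only an upper bound on the number of clusters, and clusters that still coincide at a given `beta` would need to be merged before counting them.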