Graph Sensitive Indices for Comparing Clusterings
This report addresses a limitation of existing clustering comparison methods that is relevant to researchers in data analysis, though the contribution appears incremental because it builds on the Variation of Information metric.
The authors tackle the problem of comparing clusterings by introducing two new indices, the Random Walk index (RWI) and Variation of Information with Neighbors (VIN), which take the positions of data points into account rather than set cardinality alone, and they demonstrate the indices on example data sets.
This report presents two new indices for comparing clusterings of a set of points. The motivation is that existing clustering comparison indices are based on set cardinality alone and do not consider the positions of the data points. The new indices, the Random Walk index (RWI) and Variation of Information with Neighbors (VIN), are both inspired by the clustering metric Variation of Information (VI), which has theoretical properties that are also desirable in a metric for comparing clusterings. We define the indices and discuss those of their properties, among the ones explored so far, that appear relevant for a clustering index. We also report the results of these indices on clusterings of some example data sets.
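To make the cardinality-only behaviour concrete, here is a minimal Python sketch of the standard Variation of Information, computed as the sum of the two conditional entropies H(A|B) + H(B|A) with natural logarithms. The function name `variation_of_information` and the label-list input format are assumptions made for illustration, not taken from the report, and the sketch does not implement RWI or VIN. Note that the function receives only cluster labels and never the point coordinates, which is the limitation the new indices are meant to address.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Standard VI between two clusterings given as per-point label lists.

    Hypothetical helper for illustration: labels_a[i] and labels_b[i] are
    the cluster labels of point i in the two clusterings.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same points")
    n = len(labels_a)
    if n == 0:
        return 0.0

    # Only cluster sizes and overlap sizes are used (set cardinalities).
    size_a = Counter(labels_a)
    size_b = Counter(labels_b)
    overlap = Counter(zip(labels_a, labels_b))

    vi = 0.0
    for (a, b), n_ab in overlap.items():
        p_ab = n_ab / n
        # Contribution to H(B|A) + H(A|B) from the overlap of clusters a and b.
        vi -= p_ab * (log(n_ab / size_a[a]) + log(n_ab / size_b[b]))
    return vi


if __name__ == "__main__":
    # Same partition under different label names: VI is zero.
    print(variation_of_information([0, 0, 1, 1], [1, 1, 0, 0]))
    # Two "crossed" partitions of four points: VI equals 2 * log(2).
    print(variation_of_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the inputs are labels only, any two pairs of clusterings with the same overlap counts receive the same VI value, regardless of where the points lie.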