Random Models for Fuzzy Clustering Similarity Measures
This work addresses a methodological gap for researchers who use similarity measures in fuzzy clustering, though its contribution is incremental, since it builds on existing extensions of the Rand Index.
The authors tackle the problem of selecting an appropriate random model for the Adjusted Rand Index (ARI) in fuzzy clustering. They propose a unified framework with three intuitive models that have lower computational complexity and behave distinctly from one another, as demonstrated on synthetic and benchmark data.
The Adjusted Rand Index (ARI) is a widely used measure for comparing hard clusterings, but it requires a choice of random model that is often left implicit. Several recent works have extended the Rand Index to fuzzy clusterings, but the assumptions of the most common random model are difficult to justify in fuzzy settings. We propose a single framework for computing the ARI with three random models that are intuitive and explainable for both hard and fuzzy clusterings, along with the benefit of lower computational complexity. The theory and assumptions of the proposed models are contrasted with those of the existing permutation model. Computations on synthetic and benchmark data show that each model has distinct behaviour, meaning that accurate model selection is important for the reliability of results.
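To make the role of the random model concrete, the following is a minimal sketch of the classical hard-clustering ARI under the permutation model, where the expected Rand Index is computed with both sets of cluster sizes held fixed. It does not implement the three proposed models or any fuzzy extension; the function name `rand_and_adjusted_rand` and its return layout are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import comb


def rand_and_adjusted_rand(labels_a, labels_b):
    """Return (RI, E[RI], ARI) for two hard clusterings.

    E[RI] is taken under the permutation model (cluster sizes fixed).
    Hypothetical helper for illustration only.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    if n < 2:
        raise ValueError("at least two objects are needed to form a pair")

    total_pairs = comb(n, 2)

    # Pairs co-clustered in both, in A only, and in B only.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Rand Index: fraction of pairs on which the clusterings agree.
    ri = (total_pairs + 2 * pairs_both - pairs_a - pairs_b) / total_pairs

    # Permutation model: expected number of pairs co-clustered in both.
    expected_both = pairs_a * pairs_b / total_pairs
    expected_ri = (total_pairs + 2 * expected_both - pairs_a - pairs_b) / total_pairs

    # Adjust for chance; the index is undefined when E[RI] equals 1.
    if expected_ri == 1.0:
        ari = 1.0 if ri == 1.0 else 0.0  # convention chosen for this sketch
    else:
        ari = (ri - expected_ri) / (1.0 - expected_ri)
    return ri, expected_ri, ari


if __name__ == "__main__":
    print(rand_and_adjusted_rand([0, 0, 1, 1], [0, 1, 0, 1]))
```

Only the `expected_ri` term depends on the random model, so replacing that one expectation is what changes the adjusted score while the unadjusted Rand Index stays the same.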