Asymptotic Behavior of Mean Partitions in Consensus Clustering
The work provides a mathematical foundation for consensus clustering, which matters to researchers and practitioners in data analysis, though the contribution is incremental because it builds on existing theory.
The paper addresses the lack of proven consistency for mean partitions in consensus clustering, showing that under normal assumptions, the mean partition approach is consistent and asymptotically normal, justifying its use with finite but sufficiently large sample sizes.
Although consistency is a minimum requirement of any estimator, little is known about the consistency of the mean partition approach in consensus clustering. This contribution studies the asymptotic behavior of mean partitions. We show that under normal assumptions, the mean partition approach is consistent and asymptotically normal. To derive both results, we represent partitions as points of a geometric space called an orbit space. Then we draw on results from the theory of Fréchet means and stochastic programming. The asymptotic properties hold for continuous extensions of standard cluster criteria (indices). The results justify consensus clustering using finite but sufficiently large sample sizes. Furthermore, the orbit space framework provides a mathematical foundation for studying further statistical, geometrical, and analytical properties of sets of partitions.
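To make the orbit space idea concrete, the following is a minimal sketch, not the paper's construction. It assumes a partition is encoded as a k x m membership matrix (clusters by data points), that two matrices describe the same partition whenever they differ only by a permutation of their rows, and that the distance between partitions is the smallest Euclidean distance over all such row permutations. The function names (`to_matrix`, `align`, `orbit_distance`, `mean_partition`) are hypothetical, and the alternating align-then-average loop is a common heuristic for approximating a sample Fréchet mean, swapped in here for illustration only.

```python
from itertools import permutations
from math import sqrt


def to_matrix(labels, k):
    """Encode a hard partition (one label per point) as a k x m 0/1 matrix."""
    return [[1.0 if lab == c else 0.0 for lab in labels] for c in range(k)]


def sq_dist(X, Y):
    """Squared Euclidean (Frobenius) distance between two equal-shape matrices."""
    return sum((x - y) ** 2 for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def align(X, Z):
    """Row-permuted copy of X that is closest to Z.

    Brute force over all k! permutations, so only suitable for small k
    (assumption made to keep the sketch dependency-free).
    """
    k = len(X)
    candidates = ([X[i] for i in perm] for perm in permutations(range(k)))
    return min(candidates, key=lambda XP: sq_dist(XP, Z))


def orbit_distance(X, Z):
    """Distance between partitions: minimum over relabelings of the clusters."""
    return sqrt(sq_dist(align(X, Z), Z))


def mean_partition(sample, iters=20):
    """Heuristic sample mean: align every partition to the current estimate,
    average the aligned matrices, and repeat until nothing changes."""
    k, m, n = len(sample[0]), len(sample[0][0]), len(sample)
    Z = [row[:] for row in sample[0]]  # start from the first sample partition
    for _ in range(iters):
        aligned = [align(X, Z) for X in sample]
        Z_new = [[sum(A[c][j] for A in aligned) / n for j in range(m)]
                 for c in range(k)]
        if Z_new == Z:
            break
        Z = Z_new
    return Z


if __name__ == "__main__":
    sample = [to_matrix(labels, 2)
              for labels in ([0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1])]
    # The first two label vectors describe the same partition.
    print(orbit_distance(sample[0], sample[1]))
    for row in mean_partition(sample):
        print(row)
```

The averaged matrix generally has fractional entries, so it is a soft partition rather than a hard one; this is one way to read the abstract's remark about continuous extensions of cluster criteria. For larger numbers of clusters, the brute-force search in `align` could be replaced by an assignment solver such as the Hungarian algorithm.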