Demystifying Information-Theoretic Clustering
This work addresses a fundamental flaw in information-theoretic clustering methods, which is significant for researchers in machine learning and data analysis, though it appears incremental as it builds on axiomatic foundations.
The authors tackled the problem of clustering data using information-theoretic principles without parametric assumptions, revealing that existing mutual information-based methods degrade with more data and proposing a new measure based on consistency under coarse-graining for finite data.
We propose a novel method for clustering data which is grounded in information-theoretic principles and requires no parametric assumptions. Previous attempts to use information theory to define clusters in an assumption-free way are based on maximizing mutual information between data and cluster labels. We demonstrate that this intuition suffers from a fundamental conceptual flaw that causes clustering performance to deteriorate as the amount of data increases. Instead, we return to the axiomatic foundations of information theory to define a meaningful clustering measure based on the notion of consistency under coarse-graining for finite data.