LGSep 28, 2021

Clustering to the Fewest Clusters Under Intra-Cluster Dissimilarity Constraints

Jennie Andersen, Brice Chardin, Mohamed Tribak

arXiv:2109.13644v1

Originality Synthesis-oriented

AI Analysis

This addresses clustering challenges where error bounds are critical, but it appears incremental as it builds on existing clustering frameworks with a new constraint.

The paper tackles the equiwide clustering problem, which partitions data into clusters with intra-cluster dissimilarity constraints to bound error when replacing objects with cluster representatives, and focuses on minimizing the number of clusters while evaluating trade-offs in practical solutions.

This paper introduces the equiwide clustering problem, where valid partitions must satisfy intra-cluster dissimilarity constraints. Unlike most existing clustering algorithms, equiwide clustering relies neither on density nor on a predefined number of expected classes, but on a dissimilarity threshold. Its main goal is to ensure an upper bound on the error induced by ultimately replacing any object with its cluster representative. Under this constraint, we then primarily focus on minimizing the number of clusters, along with potential sub-objectives. We argue that equiwide clustering is a sound clustering problem, and discuss its relationship with other optimization problems, existing and novel implementations as well as approximation strategies. We review and evaluate suitable clustering algorithms to identify trade-offs between the various practical solutions for this clustering problem.

View on arXiv PDF

Similar