ML, IR, LG · Apr 13, 2014

Anytime Hierarchical Clustering

arXiv:1404.3439v1 · 22 citations
Originality: Incremental advance
AI Analysis

The paper offers an incremental improvement for researchers and practitioners who need scalable, distributed hierarchical clustering of dynamic data.

The authors address hierarchical clustering with an anytime method that iteratively refines an arbitrary initial hierarchy until it reaches a chain of nested partitions satisfying a homogeneity requirement. They report numerical evidence suggesting that adaptations of the method could support decentralized, scalable algorithms for large, dynamically changing datasets.

We propose a new anytime hierarchical clustering method that iteratively transforms an arbitrary initial hierarchy on the configuration of measurements along a sequence of trees we prove for a fixed data set must terminate in a chain of nested partitions that satisfies a natural homogeneity requirement. Each recursive step re-edits the tree so as to improve a local measure of cluster homogeneity that is compatible with a number of commonly used (e.g., single, average, complete) linkage functions. As an alternative to the standard batch algorithms, we present numerical evidence to suggest that appropriate adaptations of this method can yield decentralized, scalable algorithms suitable for distributed/parallel computation of clustering hierarchies and online tracking of clustering trees applicable to large, dynamically changing databases and anomaly detection.
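To make the idea of "re-editing a tree to improve a local homogeneity measure" concrete, here is a minimal Python sketch. It is not the authors' algorithm: the local edit rule (move a subtree next to its parent's sibling whenever that strictly lowers the linkage between siblings), the nested-tuple tree encoding, and every function name below are assumptions made for illustration. Termination is not proven for this toy rule, so the loop is capped by a hypothetical `max_sweeps` parameter. Only the three linkage functions (single, average, complete) come from the abstract.

```python
import math
from itertools import product

# A tree is either a leaf (an int index into pts) or a pair (left, right).

def leaves(t):
    """Return the leaf indices under tree t."""
    return [t] if isinstance(t, int) else leaves(t[0]) + leaves(t[1])

# The three linkage functions named in the abstract, applied to the
# list of all pairwise distances between two clusters.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda d: sum(d) / len(d),
}

def linkage(a, b, pts, kind="single"):
    """Dissimilarity between the clusters under subtrees a and b."""
    d = [math.dist(pts[i], pts[j]) for i, j in product(leaves(a), leaves(b))]
    return LINKAGES[kind](d)

def local_edit(t, pts, kind):
    """Assumed local rule: if a grandchild is closer to its parent's
    sibling than to its own sibling, pair it with the parent's sibling."""
    left, right = t
    for inner, outer in ((left, right), (right, left)):
        if isinstance(inner, int):
            continue
        a, b = inner
        current = linkage(a, b, pts, kind)
        for keep, move in ((a, b), (b, a)):
            if linkage(keep, outer, pts, kind) < current:
                return ((keep, outer), move)
    return t  # no strictly improving edit at this node

def sweep(t, pts, kind):
    """Apply at most one local edit per node, top-down."""
    if isinstance(t, int):
        return t
    t = local_edit(t, pts, kind)
    return (sweep(t[0], pts, kind), sweep(t[1], pts, kind))

def refine(t, pts, kind="single", max_sweeps=100):
    """Anytime loop: every intermediate t is a valid hierarchy, so the
    caller may stop early. max_sweeps is a safety cap (assumption)."""
    for _ in range(max_sweeps):
        new = sweep(t, pts, kind)
        if new == t:
            break
        t = new
    return t

# Two well-separated groups on a line, with a deliberately bad initial tree.
pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
result = refine(((0, 2), (1, 3)), pts, kind="single")
print(result)
```

Because each sweep returns a complete binary tree over the same leaves, the procedure can be interrupted after any sweep and still yields a usable hierarchy, which is the "anytime" property the abstract refers to. The sketch runs sequentially on a single machine and does not attempt the distributed or online variants the authors mention.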
