STLGNov 19, 2021

An Asymptotic Equivalence between the Mean-Shift Algorithm and the Cluster Tree

arXiv:2111.10298v12 citations
Originality Incremental advance
AI Analysis

This work clarifies the theoretical relationship between foundational clustering methods, which is incremental but important for researchers in nonparametric statistics and machine learning.

The paper resolves a conundrum by showing that two classic nonparametric clustering approaches—the cluster tree and gradient flow—are fundamentally equivalent, proposing two natural ways to derive a partition from the cluster tree and proving they reduce to the gradient flow partition under standard density assumptions.

Two important nonparametric approaches to clustering emerged in the 1970's: clustering by level sets or cluster tree as proposed by Hartigan, and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hosteler. In a recent paper, we argue the thesis that these two approaches are fundamentally the same by showing that the gradient flow provides a way to move along the cluster tree. In making a stronger case, we are confronted with the fact the cluster tree does not define a partition of the entire support of the underlying density, while the gradient flow does. In the present paper, we resolve this conundrum by proposing two ways of obtaining a partition from the cluster tree -- each one of them very natural in its own right -- and showing that both of them reduce to the partition given by the gradient flow under standard assumptions on the sampling density.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes