From graph cuts to isoperimetric inequalities: Convergence rates of Cheeger cuts on data clouds
This work provides foundational theoretical guarantees for graph-based clustering algorithms in machine learning, addressing a gap in the statistical analysis of clustering on manifold data.
The paper quantifies convergence rates for graph-based clustering algorithms that use Cheeger cuts on data sampled from smooth manifolds, obtaining high-probability convergence rates for both the Cheeger constant and the Cheeger cuts towards their continuum counterparts.
In this work we study statistical properties of graph-based clustering algorithms that rely on the optimization of balanced graph cuts, the main example being the optimization of Cheeger cuts. We consider proximity graphs built from data sampled from an underlying distribution supported on a generic smooth compact manifold $M$. In this setting, we obtain high-probability convergence rates for both the Cheeger constant and the associated Cheeger cuts towards their continuum counterparts. The key technical tools are careful estimates of interpolation operators that lift empirical Cheeger cuts to the continuum, as well as continuum stability estimates for isoperimetric problems. To our knowledge, the quantitative estimates obtained here are the first of their kind.
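To make the objects in the abstract concrete, here is a minimal sketch of a graph Cheeger cut on a tiny point cloud. Everything in it is an illustrative assumption rather than the paper's construction: the unweighted `eps`-proximity graph, the balance term `min(|A|, |A^c|)`, the function names `proximity_graph` and `cheeger_cut_bruteforce`, and the nine hand-picked points in the plane. The brute-force search is exponential in the number of points and is only meant to show what is being minimized.

```python
import itertools
import numpy as np


def proximity_graph(points, eps):
    """Unweighted eps-proximity graph: connect two points if they are closer than eps.

    Assumption: a 0/1 kernel is used here for simplicity; other kernels are possible.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    weights = (dists < eps).astype(float)
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return weights


def cheeger_cut_bruteforce(weights):
    """Minimize cut(A, A^c) / min(|A|, |A^c|) over nonempty proper subsets A.

    Only subsets with |A| <= n // 2 are enumerated, so min(|A|, |A^c|) = |A|.
    Exponential cost: usable only for very small graphs.
    """
    n = weights.shape[0]
    best_value, best_mask = np.inf, None
    for size in range(1, n // 2 + 1):
        for subset in itertools.combinations(range(n), size):
            mask = np.zeros(n, dtype=bool)
            mask[list(subset)] = True
            cut = weights[mask][:, ~mask].sum()  # total weight crossing the cut
            value = cut / size
            if value < best_value:
                best_value, best_mask = value, mask
    return best_value, best_mask


# Hypothetical data: two tight clusters of four points joined by one bridge point.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],   # left cluster
    [0.55, 0.05],                                      # bridge point
    [1.0, 0.0], [1.1, 0.0], [1.0, 0.1], [1.1, 0.1],   # right cluster
])
weights = proximity_graph(points, eps=0.5)
value, side = cheeger_cut_bruteforce(weights)
```

On this toy input the minimizer isolates the left cluster: two edges cross the cut and the smaller side has four points, so `value` is 0.5. In practice the combinatorial search is replaced by a relaxation or heuristic; the sketch says nothing about the convergence rates themselves.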