OCMLMar 16, 2016

Exact Clustering of Weighted Graphs via Semidefinite Programming

arXiv:1603.05296v52 citations
Originality Incremental advance
AI Analysis

This provides a theoretical guarantee for exact clustering in weighted graphs, which is incremental as it extends existing methods to handle noise and outliers more robustly.

The paper tackles the densest k-disjoint-clique problem for clustering weighted graphs by using a semidefinite programming relaxation, proving exact recovery with high probability under conditions like concentrated within-cluster weights and moderate outliers, and showing recovery of clusters as small as polylogarithmic in graph size in sparse graphs.

As a model problem for clustering, we consider the densest k-disjoint-clique problem of partitioning a weighted complete graph into k disjoint subgraphs such that the sum of the densities of these subgraphs is maximized. We establish that such subgraphs can be recovered from the solution of a particular semidefinite relaxation with high probability if the input graph is sampled from a distribution of clusterable graphs. Specifically, the semidefinite relaxation is exact if the graph consists of k large disjoint subgraphs, corresponding to clusters, with weight concentrated within these subgraphs, plus a moderate number of outliers. Further, we establish that if noise is weakly obscuring these clusters, i.e, the between-cluster edges are assigned very small weights, then we can recover significantly smaller clusters. For example, we show that in approximately sparse graphs, where the between-cluster weights tend to zero as the size n of the graph tends to infinity, we can recover clusters of size polylogarithmic in n. Empirical evidence from numerical simulations is also provided to support these theoretical phase transitions to perfect recovery of the cluster structure.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes