LGMLNov 3, 2019

Clustering in Partially Labeled Stochastic Block Models via Total Variation Minimization

arXiv:1911.00958v24 citations
Originality Incremental advance
AI Analysis

This work addresses clustering problems in network data analysis for researchers and practitioners, but it is incremental as it builds on existing stochastic block model frameworks with partial labels.

The paper tackles clustering in partially labeled stochastic block models by using total variation minimization, achieving accurate clustering under specific model parameter conditions and implementing a scalable message-passing protocol.

A main task in data analysis is to organize data points into coherent groups or clusters. The stochastic block model is a probabilistic model for the cluster structure. This model prescribes different probabilities for the presence of edges within a cluster and between different clusters. We assume that the cluster assignments are known for at least one data point in each cluster. In such a partially labeled stochastic block model, clustering amounts to estimating the cluster assignments of the remaining data points. We study total variation minimization as a method for this clustering task. We implement the resulting clustering algorithm as a highly scalable message-passing protocol. We also provide a condition on the model parameters such that total variation minimization allows for accurate clustering.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes