SIAIJul 9, 2025

Scalable Attribute-Missing Graph Clustering via Neighborhood Differentiation

arXiv:2507.13368v21 citationsh-index: 28ICML
Originality Incremental advance
AI Analysis

This addresses clustering in real-world graphs like social networks, which are often large and have missing attributes, but it is incremental as it builds on existing multi-view clustering methods.

The paper tackles the problem of clustering nodes in large-scale attribute-missing graphs by proposing CMV-ND, a method that preprocesses graph structural information into complementary views, and it shows significant performance improvements on six datasets.

Deep graph clustering (DGC), which aims to unsupervisedly separate the nodes in an attribute graph into different clusters, has seen substantial potential in various industrial scenarios like community detection and recommendation. However, the real-world attribute graphs, e.g., social networks interactions, are usually large-scale and attribute-missing. To solve these two problems, we propose a novel DGC method termed \underline{\textbf{C}}omplementary \underline{\textbf{M}}ulti-\underline{\textbf{V}}iew \underline{\textbf{N}}eighborhood \underline{\textbf{D}}ifferentiation (\textit{CMV-ND}), which preprocesses graph structural information into multiple views in a complete but non-redundant manner. First, to ensure completeness of the structural information, we propose a recursive neighborhood search that recursively explores the local structure of the graph by completely expanding node neighborhoods across different hop distances. Second, to eliminate the redundancy between neighborhoods at different hops, we introduce a neighborhood differential strategy that ensures no overlapping nodes between the differential hop representations. Then, we construct $K+1$ complementary views from the $K$ differential hop representations and the features of the target node. Last, we apply existing multi-view clustering or DGC methods to the views. Experimental results on six widely used graph datasets demonstrate that CMV-ND significantly improves the performance of various methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes