Bounds on Perfect Node Classification: A Convex Graph Clustering Perspective
This work addresses node classification in graph-based machine learning, offering an incremental improvement: it combines graph structure with node-specific information to obtain recovery guarantees under milder conditions than those for graph clustering alone.
The paper tackles the transductive node classification problem by proposing a novel optimization problem that integrates node-specific information into spectral graph clustering. It shows that suitable node-specific information guarantees perfect community recovery under milder conditions than graph clustering alone.
We present an analysis of the transductive node classification problem, where the underlying graph consists of communities that agree with the node labels and node features. For node classification, we propose a novel optimization problem that incorporates the node-specific information (labels and features) in a spectral graph clustering framework. Studying this problem, we demonstrate a synergy between the graph structure and the node-specific information. In particular, we show that suitable node-specific information guarantees that the solution of our optimization problem perfectly recovers the communities, under milder conditions than those required by graph clustering alone. We present algorithmic solutions to our optimization problem, together with numerical experiments that confirm this synergy.
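To make the general idea concrete, here is a minimal sketch of combining graph structure with node features in a two-way spectral split. It is not the optimization problem proposed in the paper, which the abstract does not state. The function name `two_way_spectral_split`, the Gaussian feature kernel, the weight `lam`, and the toy graph are all assumptions chosen for illustration.

```python
import numpy as np


def two_way_spectral_split(A, X=None, lam=1.0):
    """Split nodes into two groups by the sign of the Fiedler vector.

    A   : (n, n) symmetric adjacency matrix.
    X   : optional (n, d) node features (hypothetical input format).
    lam : weight of the feature-similarity term (an assumed knob).
    """
    W = A.astype(float).copy()
    if X is not None:
        # Gaussian similarity between node features, with no self-loops.
        sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
        K = np.exp(-sq_dist)
        np.fill_diagonal(K, 0.0)
        W += lam * K
    # Unnormalized graph Laplacian of the combined weights.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue (Fiedler vector).
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]
    return (fiedler > 0).astype(int)


# Toy example: two 4-node cliques joined by a single bridge edge,
# with a one-dimensional feature that agrees with the communities.
block = np.ones((4, 4)) - np.eye(4)
A = np.zeros((8, 8))
A[:4, :4] = block
A[4:, 4:] = block
A[3, 4] = A[4, 3] = 1.0
X = np.array([[0.0]] * 4 + [[1.0]] * 4)

labels = two_way_spectral_split(A, X, lam=1.0)
print(labels)  # community ids are defined only up to swapping 0 and 1
```

In this toy case the graph alone already separates the two cliques, so the feature term only reinforces the split. The synergy described in the abstract concerns regimes where the graph alone is insufficient, and this sketch does not attempt to reproduce those.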