SILG Oct 22, 2020

Joint Use of Node Attributes and Proximity for Semi-Supervised Classification on Graphs

arXiv:2010.11536v2
AI Analysis

This work addresses the need for flexible node classification methods that can adapt to varying relationships between labels, attributes, and network structure, offering an incremental improvement over existing approaches.

The paper addresses node classification on graphs where the homophily assumption may not hold. It proposes JANE, a generative probabilistic model that jointly weighs node attributes and node proximity via embeddings. In the authors' experiments on a variety of network datasets, JANE showed versatility and competitive performance compared to standard baselines.

The task of node classification is to infer unknown node labels, given the labels for some of the nodes along with the network structure and other node attributes. Typically, approaches for this task assume homophily, whereby neighboring nodes have similar attributes and a node's label can be predicted from the labels of its neighbors or other proximate (i.e., nearby) nodes in the network. However, such an assumption may not always hold -- in fact, there are cases where labels are better predicted from the individual attributes of each node rather than the labels of its proximate nodes. Ideally, node classification methods should flexibly adapt to a range of settings wherein unknown labels are predicted either from labels of proximate nodes, or individual node attributes, or partly both. In this paper, we propose a principled approach, JANE, based on a generative probabilistic model that jointly weighs the role of attributes and node proximity via embeddings in predicting labels. Our experiments on a variety of network datasets demonstrate that JANE exhibits the desired combination of versatility and competitive performance compared to standard baselines.
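
To make the contrast in the abstract concrete, the following toy sketch predicts a label for one unlabeled node in two ways: from the labels of its neighbors (the homophily view) and from its own attributes. It then blends the two with a fixed weight. This is not JANE and not the authors' model: the function names, the toy graph, the centroid-distance scoring, and the mixing weight `alpha` are all assumptions made for illustration. JANE is described as a generative probabilistic model that weighs the two signals via embeddings, whereas here the weight is set by hand.

```python
from collections import Counter
import math


def proximity_scores(adj, labels, node, classes):
    """Fraction of the node's labeled neighbors that carry each class."""
    counts = Counter(labels[n] for n in adj[node] if n in labels)
    total = sum(counts.values())
    if total == 0:
        # No labeled neighbors: fall back to a uniform score.
        return {c: 1.0 / len(classes) for c in classes}
    return {c: counts[c] / total for c in classes}


def attribute_scores(features, labels, node, classes):
    """Normalized exp(-distance) to each class's attribute centroid."""
    weights = {}
    for c in classes:
        members = [features[n] for n, y in labels.items() if y == c]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        weights[c] = math.exp(-math.dist(features[node], centroid))
    norm = sum(weights.values())
    return {c: w / norm for c, w in weights.items()}


def predict(adj, features, labels, node, alpha):
    """Blend both signals; alpha=1 uses only neighbors, alpha=0 only attributes."""
    classes = sorted(set(labels.values()))
    prox = proximity_scores(adj, labels, node, classes)
    attr = attribute_scores(features, labels, node, classes)
    blended = {c: alpha * prox[c] + (1 - alpha) * attr[c] for c in classes}
    return max(blended, key=blended.get)


# Hypothetical toy graph: node 4 is linked only to class "a" nodes,
# but its attributes sit close to the class "b" nodes.
adj = {0: [1, 4], 1: [0, 4], 2: [3], 3: [2], 4: [0, 1]}
features = {
    0: [0.0, 0.0],
    1: [0.1, 0.0],
    2: [1.0, 1.0],
    3: [0.9, 1.0],
    4: [1.0, 0.9],
}
labels = {0: "a", 1: "a", 2: "b", 3: "b"}  # node 4 is unlabeled

by_neighbors = predict(adj, features, labels, node=4, alpha=1.0)
by_attributes = predict(adj, features, labels, node=4, alpha=0.0)
```

On this toy graph the two extremes disagree about node 4, which is the situation the abstract describes: a method tied to either signal alone gets one kind of dataset wrong, so the balance between them should adapt to the data.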
