LG, SI | Mar 3, 2022

Graph Representation Learning Beyond Node and Homophily

arXiv:2203.01564v1 | 20 citations | h-index: 7
Originality: Highly original
AI Analysis

This paper addresses the problem of task-agnostic graph embeddings for machine learning researchers and practitioners. Rather than an incremental refinement, it introduces a new paradigm intended to handle both node and edge tasks.

The paper tackles a limitation of existing graph representation learning methods: they rely on node homophily and therefore perform poorly on tasks such as edge classification. It proposes PairE, an unsupervised method that uses paired nodes as the basic unit to retain high-frequency signals. The reported results show PairE outperforming state-of-the-art baselines, with up to 101.1% relative improvement on edge classification and up to 82.5% on node classification.
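To make the "relative improvement" figures concrete, here is a minimal sketch of the arithmetic, assuming the usual definition (gain divided by the baseline score). This summary does not state the exact metric or formula used in the paper, and the scores below are hypothetical values chosen only for illustration.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the gain of `new` over `baseline` as a percentage of `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


# Hypothetical scores, not taken from the paper.
hypothetical_baseline = 0.40
hypothetical_new = 0.80
gain = relative_improvement(hypothetical_baseline, hypothetical_new)
print(f"relative improvement: {gain:.1f}%")
```

Under this definition, a relative improvement above 100% means the score more than doubled relative to the baseline.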

Unsupervised graph representation learning aims to distill various graph information into a dense vector embedding that is agnostic to the downstream task. However, existing graph representation learning approaches are designed mainly under the node homophily assumption (connected nodes tend to have similar labels) and optimize performance on node-centric downstream tasks. Their design runs against the task-agnostic principle and generally performs poorly on tasks, e.g., edge classification, that demand feature signals beyond the node view and the homophily assumption. To condense different feature signals into the embeddings, this paper proposes PairE, a novel unsupervised graph embedding method that uses two paired nodes as the basic unit of embedding to retain the high-frequency signals between nodes and thereby support both node-related and edge-related tasks. Accordingly, a multi-self-supervised autoencoder is designed to fulfill two pretext tasks: one better retains the high-frequency signal, and the other enhances the representation of commonality. Our extensive experiments on a diverse set of benchmark datasets show that PairE outperforms the unsupervised state-of-the-art baselines, with up to 101.1% relative improvement on edge classification tasks, which rely on both the high- and low-frequency signals in the pair, and up to 82.5% relative performance gain on node classification tasks.
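The intuition behind using a node pair as the embedding unit can be sketched with a toy example. This is not the PairE implementation or its autoencoder. It assumes that "high-frequency signal" can be read as the difference between two connected nodes' features and "commonality" as what they share, and all node names and feature values are hypothetical. It shows that averaging the two endpoints (a low-pass view) can make two different edges indistinguishable, whereas keeping both endpoints side by side preserves the difference.

```python
# Illustrative sketch only; names and feature values are hypothetical.
features = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.5, 0.5],
    "d": [0.5, 0.5],
}
edges = [("a", "b"), ("c", "d")]


def mean_pool(u, v):
    """Low-pass view: average the two endpoints' features."""
    return [(x + y) / 2 for x, y in zip(features[u], features[v])]


def pair_concat(u, v):
    """Pair view: keep both endpoints' features side by side."""
    return features[u] + features[v]


def difference(u, v):
    """High-frequency view: how the two endpoints differ."""
    return [x - y for x, y in zip(features[u], features[v])]


mean_views = [mean_pool(u, v) for u, v in edges]
pair_views = [pair_concat(u, v) for u, v in edges]
diff_views = [difference(u, v) for u, v in edges]

# The averaged views of the two edges coincide, the pair views do not.
print("means equal:", mean_views[0] == mean_views[1])
print("pairs equal:", pair_views[0] == pair_views[1])
print("differences:", diff_views)
```

In this toy setting, an edge classifier that sees only the averaged features cannot tell edge (a, b) from edge (c, d), while one that sees the pair input can. This is one way to read the abstract's claim that edge-related tasks need signals beyond the homophily assumption.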

Code Implementations: 1 repo
