LG, SI | Oct 8, 2023

Data-centric Graph Learning: A Survey

arXiv:2310.04987v3 | 34 citations | h-index: 12
Originality: Synthesis-oriented
AI Analysis

The survey gives graph learning researchers a comprehensive overview, but its contribution is incremental: it surveys existing methods rather than introducing new ones.

This survey addresses how to improve graph learning by shifting focus from model-centric to data-centric approaches. It reviews methods that modify graph data (topology, features, and labels) to enhance model performance and address data issues.

The history of artificial intelligence (AI) has witnessed the significant impact of high-quality data on various deep learning models, such as ImageNet for AlexNet and ResNet. Recently, instead of designing more complex neural architectures as in model-centric approaches, the attention of the AI community has shifted to data-centric ones, which focus on better processing of data to strengthen the ability of neural models. Graph learning, which operates on ubiquitous topological data, also plays an important role in the era of deep learning. In this survey, we comprehensively review graph learning approaches from the data-centric perspective, and aim to answer three crucial questions: (1) when to modify graph data, (2) what part of the graph data needs modification to unlock the potential of various graph models, and (3) how to safeguard graph models from problematic data influence. Accordingly, we propose a novel taxonomy based on the stages in the graph learning pipeline, and highlight the processing methods for the different data structures in graph data, i.e., topology, feature, and label. Furthermore, we analyze some potential problems embedded in graph data and discuss how to solve them in a data-centric manner. Finally, we provide some promising future directions for data-centric graph learning.
