LG · AI · Jul 31, 2021

Grain: Improving Data Efficiency of Graph Neural Networks via Diversified Influence Maximization

arXiv:2108.00219v1 · 60 citations
Originality: Highly original
AI Analysis

This work addresses data efficiency for GNNs in domains like social networks and e-commerce by integrating two previously parallel research areas: data selection and social influence maximization.

The paper tackles the problem of data selection for Graph Neural Networks (GNNs) by introducing Grain, a framework that connects data selection with social influence maximization, resulting in significant improvements in performance and efficiency for tasks like active learning and core-set selection on public datasets.

Data selection methods, such as active learning and core-set selection, are useful tools for improving the data efficiency of deep learning models on large-scale datasets. However, recent deep learning models have moved beyond independent and identically distributed data to graph-structured data, such as social networks, e-commerce user-item graphs, and knowledge graphs. This evolution has led to the emergence of Graph Neural Networks (GNNs), which go beyond the models that existing data selection methods are designed for. Therefore, we present Grain, an efficient framework that opens up a new perspective by connecting data selection in GNNs with social influence maximization. By exploiting the common patterns of GNNs, Grain introduces a novel feature propagation concept, a diversified influence maximization objective with novel influence and diversity functions, and a greedy algorithm with an approximation guarantee into a unified framework. Empirical studies on public datasets demonstrate that Grain significantly improves both the performance and efficiency of data selection (including active learning and core-set selection) for GNNs. To the best of our knowledge, this is the first attempt to bridge two largely parallel threads of research, data selection and social influence maximization, in the setting of GNNs, paving new ways for improving data efficiency.
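The greedy selection step described in the abstract can be illustrated with a minimal sketch. This is not Grain's actual objective: the abstract does not define the influence or diversity functions, so the sketch assumes a plain coverage objective (how many nodes receive any k-step propagation weight from the selected set) as a stand-in. The function names `propagation_matrix` and `greedy_select` and the toy path graph are hypothetical. For a monotone submodular coverage function like this one, greedy selection carries the classic (1 - 1/e) approximation guarantee; whether the paper's bound takes the same form is not stated here.

```python
import numpy as np


def propagation_matrix(adj, k):
    """k-step row-normalized random-walk matrix with self-loops (assumed propagation rule)."""
    a = adj + np.eye(adj.shape[0])
    p = a / a.sum(axis=1, keepdims=True)
    return np.linalg.matrix_power(p, k)


def greedy_select(reach, budget):
    """Greedily pick `budget` nodes that maximize the number of covered nodes.

    reach[u, v] is True when node u influences node v (assumed definition).
    """
    n = reach.shape[0]
    covered = np.zeros(n, dtype=bool)
    picked = np.zeros(n, dtype=bool)
    selected = []
    for _ in range(budget):
        # Marginal gain of each candidate: newly covered nodes only.
        gains = (reach & ~covered).sum(axis=1)
        gains = np.where(picked, -1, gains)  # never re-pick a node
        best = int(np.argmax(gains))
        selected.append(best)
        picked[best] = True
        covered |= reach[best]
    return selected, covered


# Toy example: a path graph 0-1-2-3-4-5.
n = 6
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

# Node u influences v when v's 1-step aggregation puts nonzero weight on u.
reach = propagation_matrix(adj, k=1).T > 0
selected, covered = greedy_select(reach, budget=2)
print(selected)  # [1, 4]
```

With a budget of two labels, the sketch picks nodes 1 and 4, whose one-hop neighborhoods together cover all six nodes. A diversity term, as the abstract describes, would change the gain computation but not the structure of the loop.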

Code Implementations: 1 repo