LGAIMay 6, 2024

E2GNN: Efficient Graph Neural Network Ensembles for Semi-Supervised Classification

arXiv:2405.03401v12 citations
AI Analysis

This work addresses the problem of poor inference efficiency and sub-optimal performance in GNN ensembles for researchers and practitioners in graph machine learning, representing an incremental improvement.

This paper tackles the challenge of efficiently and effectively ensembling graph neural networks (GNNs) for semi-supervised classification by proposing E2GNN, which uses a multi-layer perceptron (MLP) to mimic GNN predictions and a reinforced discriminator to filter errors, achieving improved performance across 8 benchmark datasets.

This work studies ensemble learning for graph neural networks (GNNs) under the popular semi-supervised setting. Ensemble learning has shown superiority in improving the accuracy and robustness of traditional machine learning by combining the outputs of multiple weak learners. However, adopting a similar idea to integrate different GNN models is challenging because of two reasons. First, GNN is notorious for its poor inference ability, so naively assembling multiple GNN models would deteriorate the inference efficiency. Second, when GNN models are trained with few labeled nodes, their performance are limited. In this case, the vanilla ensemble approach, e.g., majority vote, may be sub-optimal since most base models, i.e., GNNs, may make the wrong predictions. To this end, in this paper, we propose an efficient ensemble learner--E2GNN to assemble multiple GNNs in a learnable way by leveraging both labeled and unlabeled nodes. Specifically, we first pre-train different GNN models on a given data scenario according to the labeled nodes. Next, instead of directly combing their outputs for label inference, we train a simple multi-layer perceptron--MLP model to mimic their predictions on both labeled and unlabeled nodes. Then the unified MLP model is deployed to infer labels for unlabeled or new nodes. Since the predictions of unlabeled nodes from different GNN models may be incorrect, we develop a reinforced discriminator to effectively filter out those wrongly predicted nodes to boost the performance of MLP. By doing this, we suggest a principled approach to tackle the inference issues of GNN ensembles and maintain the merit of ensemble learning: improved performance. Comprehensive experiments over both transductive and inductive settings, across different GNN backbones and 8 benchmark datasets, demonstrate the superiority of E2GNN.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes