Mar 9, 2024

Addressing Shortcomings in Fair Graph Learning Datasets: Towards a New Benchmark

arXiv:2403.06017v2 | 8 citations | h-index: 16 | Has Code | KDD
AI Analysis

This work addresses a critical issue for fair graph learning researchers by providing improved datasets. The contribution is incremental: it serves an existing evaluation need and introduces no new methods.

The authors address the problem of poorly constructed datasets for evaluating fair graph learning methods by developing a collection of synthetic, semi-synthetic, and real-world datasets; the synthetic and semi-synthetic ones expose controllable bias parameters. Their systematic evaluations show that the datasets are effective for benchmarking these methods.

Fair graph learning plays a pivotal role in numerous practical applications. Recently, many fair graph learning methods have been proposed; however, their evaluation often relies on poorly constructed semi-synthetic datasets or substandard real-world datasets. In such cases, even a basic Multilayer Perceptron (MLP) can outperform Graph Neural Networks (GNNs) in both utility and fairness. In this work, we illustrate that many datasets fail to provide meaningful information in the edges, which may challenge the necessity of using graph structures in these problems. To address these issues, we develop and introduce a collection of synthetic, semi-synthetic, and real-world datasets that fulfill a broad spectrum of requirements. These datasets are thoughtfully designed to include relevant graph structures and bias information crucial for the fair evaluation of models. The proposed synthetic and semi-synthetic datasets offer the flexibility to create data with controllable bias parameters, thereby enabling the generation of desired datasets with user-defined bias values with ease. Moreover, we conduct systematic evaluations of these proposed datasets and establish a unified evaluation approach for fair graph learning models. Our extensive experimental results with fair graph learning methods across our datasets demonstrate their effectiveness in benchmarking the performance of these methods. Our datasets and the code for reproducing our experiments are available at https://github.com/XweiQ/Benchmark-GraphFairness.
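To make "controllable bias parameters" concrete, here is a minimal sketch of a toy generator. It is not the authors' generator or the code in their repository. The function names (`make_biased_graph`, `sensitive_homophily`, `statistical_parity_gap`) and the parameters (`p_same`, `p_diff`, `label_bias`) are hypothetical, chosen only for illustration. The sketch assumes a binary sensitive attribute, a binary label, and an undirected graph in which the edge probability depends on whether two nodes share the sensitive attribute.

```python
import random


def make_biased_graph(n, p_same, p_diff, label_bias, seed=0):
    """Toy generator (illustrative only): returns (sensitive, labels, edges).

    p_same / p_diff: edge probability for node pairs that share / do not
    share the sensitive attribute. Their ratio is the structural bias knob.
    label_bias: probability that a node's label equals its sensitive
    attribute (0.5 makes the label independent of the attribute).
    """
    rng = random.Random(seed)
    sensitive = [rng.randint(0, 1) for _ in range(n)]
    labels = [s if rng.random() < label_bias else 1 - s for s in sensitive]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_same if sensitive[i] == sensitive[j] else p_diff
            if rng.random() < p:
                edges.append((i, j))
    return sensitive, labels, edges


def sensitive_homophily(sensitive, edges):
    """Fraction of edges whose endpoints share the sensitive attribute."""
    if not edges:
        return 0.0
    same = sum(1 for i, j in edges if sensitive[i] == sensitive[j])
    return same / len(edges)


def statistical_parity_gap(sensitive, predictions):
    """|P(pred=1 | s=0) - P(pred=1 | s=1)|, a common group-fairness gap."""
    rates = []
    for group in (0, 1):
        members = [p for s, p in zip(sensitive, predictions) if s == group]
        rates.append(sum(members) / len(members) if members else 0.0)
    return abs(rates[0] - rates[1])


if __name__ == "__main__":
    s, y, e = make_biased_graph(200, p_same=0.10, p_diff=0.02, label_bias=0.8)
    print("sensitive homophily:", sensitive_homophily(s, e))
    print("parity gap of the labels themselves:", statistical_parity_gap(s, y))
```

Setting `p_same` equal to `p_diff` removes the attribute signal from the edges, and setting `label_bias` to 0.5 removes it from the labels, so the two knobs can be varied independently when probing how a model reacts to each source of bias.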

